DIFFERENTIAL PERCEPTION OF CERTAIN JOBS
AND PEOPLE BY MANAGERS, CLERKS, AND
WORKERS IN INDUSTRY 1


HARRY C. TRIANDIS 2


Cornell University 


The recent development of the semantic 
differential by Osgood and his associates (Os- 
good, Suci, & Tannenbaum, 1957) has pro- 
vided a procedure of great simplicity and 
flexibility for the study of the frames of refer- 
ence of industrial subjects. Ss are required 
to rate a series of concepts on a series of 
scales. The means of the ratings for a given 
group provide information about the group 
frame of reference. Weaver (1958) compared 
the meaning of 10 concepts for members of 
management and labor leaders and found sig- 
nificant differences in the meanings of the 
concepts “the closed shop,” “grievance,” “the 
labor movement,” “working during a strike,” 
“labor in politics,” and other concepts, be- 
tween the two groups. The present study 
describes the use of the semantic differential 
for the study of how certain jobs and certain 
people are perceived by various groups of 
industrial Ss. 


Method 
Technical Note 


Osgood and his associates used mostly college students as Ss. The writer’s attempt to use the semantic
differential with workers suggested that these Ss find it extremely difficult to respond to “unusual”
combinations of concept and scale (e.g., Joe Dow rated on angular-rounded). For this reason, special
differentials for jobs and for people were developed. The procedure for the development of the
differentials was as follows: First, 12 triads of jobs and 12 triads of people were presented to 105
industrial Ss (20 workers, 30 male clerks, 30 female clerks, and 25 managers). The Ss were asked:
“Which one of these three jobs (people) is different from the other two and why?” (e.g., triad:
teacher, welder, clerk. Response: teacher is professional, or welder is manual, or clerk is routine
worker). “What is the opposite of this characteristic?” (e.g., unprofessional, white collar, or
variable). The lists of the characteristics obtained from the various groups of Ss differed from each
other. An analysis of these lists has been published elsewhere (Triandis, 1959). A stratified random
sample of the characteristics so obtained constituted 28 scales of each of the semantic differentials.
An additional 10 scales were selected from Osgood, Suci, and Tannenbaum (1957), so as to represent the
seven factors of their semantic differential factor analysis. The sheets of paper used for this test
could accommodate only 38 scales. The differentials and the instructions that were finally used may be
found in Triandis (1958, pp. 296-298).

1 This paper is based on portions of the writer’s doctoral dissertation. The author gratefully
acknowledges the guidance and help of W. W. Lambert, T. A. Ryan, and W. F. Whyte. The larger study,
of which this is a part, was supported by a grant from the Foundation for Research on Human Behavior.

2 Now at the University of Illinois.


Procedure 


The two 38-scale semantic differentials, one for jobs and one for people, which were developed as
described above, were administered to 156 Ss. Usable answers were received from 5 members of the
company’s executive committee, 14 department managers, 18 section managers, 32 female clerks, 28 male
clerks, and 55 workers. The Ss rated five jobs (welder, teacher, vice-president, personnel director,
and clerk) in counterbalanced order on the semantic differential. In addition they rated their
supervisors, the company’s personnel director, the boss of their supervisor, the vice-president of
their division, a “fellow at work whom you like,” and “an effective manager you have known well and
who is not the same as any of the people already rated.”


Results and Discussion 


The means of the ratings of the various 
jobs and people on the 38 scales of the semantic
differential, for groups consisting of upper
managers, lower managers, female clerks, male
clerks, and workers, can be found elsewhere 
(Triandis, 1958). Limitations of space do 
not permit complete presentation of the find- 
ings. A general observation, however, is pos- 
sible; the means of the various groups on 
most of the scales of the differential are very 
similar. Against this background of simi- 
larity, however, it is possible to note several 
important differences. 
Differences in the Perception of People 

A comparison of the two “ideal” concepts, 
the workmate and the manager, reveals great 
consistency between the groups and between 
levels of each of the divisions. Both ideals 
are considered successful, though the manager 
is a little more successful than the workmate. 
The ideal manager is purposeful, while the 
workmate does not have this trait. Both are 
easy to understand, stable, educated, kind, 
ambitious, though the manager is a little more 
ambitious than the workmate; gracious, 
though the workmate a little more than the 
manager; receptive, capable, active, colorful, 
cooperative, original, though the manager a 
little more than the workmate; experienced, 
young, friendly, intelligent, aggressive, though 
the manager a little more than the workmate; 
skilled, progressive, powerful, though the man- 
ager a little more than the workmate; so- 
ciable, very concerned with public relations, 
like traveling, good-humored, self-made, and 
just slightly emotional. The workmate’s pay 
is average, the manager’s high. The manager 
is more talkative. The workmate does not 
have too many headaches on the job, the man- 
ager does. Finally, the workmate is more 
satisfied than the manager; the latter is at 
times very dissatisfied. 


The Characteristics of the Successful and 
Unsuccessful Supervisors 


In this section we will undertake to answer 
the following question: Suppose there are two 
department heads who are considered as suc- 
cessful and relatively unsuccessful respectively 
by their supervisor (a vice-president). How 
are these two men perceived by their subordinates?
Let us call them Mr. Effective (E)
and Mr. Ineffective (I). Both are perceived as
being successful. E is perceived to be more


purposeful than I. But I is easier to under- 
stand. Both are stable, educated, ambitious, 
strong, capable, active, quiet, colorful, co- 
operative, conventional, get high pay, experi- 
enced, young, intelligent, satisfied, aggressive, 
fast, skilled, progressive, powerful, sociable, 
and share most of the other characteristics in 
equal amounts, yet E is crude while I is gracious,
E is assuming and I unassuming, E is
stubborn, and I is receptive, I is more friendly 
and good humored, E is more unfriendly and 
bad humored. In short, I is closer to the pic- 
ture of the ideal manager of most groups, 
while E is more instrumental and task-ori- 
ented. We might conclude, then, that the 
particular vice-president is more task-oriented 
than employee-oriented. Three out of four 
of the vice-presidents of this company were 
similarly task-oriented. 

Another comparison considered the profiles 
of 11 department heads who are well liked 
and 3 department heads who are disliked by 
their subordinates. The disliked department
heads were perceived as being more difficult 
to understand, cruel, crude, stubborn, unco- 
operative, inexperienced, unfriendly, dissatis- 
fied, and assisted. 


The Meaning of the Similarity Between the
Perception of the “Actual” and the
“Ideal” Supervisor

It is reasonable to expect that the greater the similarity between the profiles of the actual and the
ideal supervisors, the greater will be the liking of the subordinate for the supervisor. This would
imply that Ss like those people who perform their role in society in such a way that their behavior
approaches the ideal expected behavior for the particular role, as the latter is perceived by the Ss.
To test this view, the similarity in the profiles of the actual and the ideal supervisors was
correlated with ratings of liking of the supervisor on a Thurstone-type successive intervals scale
(Edwards, 1957, pp. 120-145). The profile similarity was computed as follows:


$$ S = 1 - \frac{\sum d^2}{36n} = 1 - \frac{D^2}{36n} $$

where d is the difference between the perception of the actual and the ideal supervisors on one of
the n scales of the semantic differential. D is the same as the D-statistic used by Osgood et al. and
others. In our case n = 38. The 36 is a constant which comes from the fact that a 7-point scale was
used: the largest possible difference on a single scale is 6, and 6² = 36.

The Pearson r coefficients between S and L (liking for the supervisor) were as follows: For 42
managers and top clerks r = .73 (p < .0001); for 50 clerks r = .58 (p < .001); and for 50 workers
r = .54 (p < .001). We conclude that our hypothesis about the relationship of “ideal” and actual
behaviors to liking is confirmed.


Differences in the Perception of Jobs


There is a tendency for the workers to perceive a Welder’s Job as involving more experience, and
being more desirable, important, responsible, alert, difficult, professional, executive, creative,
skilled, and doing more things, as well as less routine, as compared to the other groups. In other
words, there is a tendency to idealize, or overevaluate, this factory job. This suggests that any
tendency of management to minimize the importance of this job will be perceived as offensive. This
finding is consistent with a case study by Whyte (1956) in which it was shown that a vice-president’s
remark that a certain skilled job was “just a watchman’s job” was so infuriating to the workers that
they joined the C.I.O. in large numbers. The findings suggest that management ought to consider, in
its communications to workers, the tendency of workers to value their jobs more than management
values them. In the case of clerks rating a Clerk’s Job, however, the data did not reveal any
tendency towards overevaluation.


Table 1

D Matrix for the Perception of Five Jobs by Five Groups *

Groups (columns): Male Clerks (N = 33); Female Clerks (N = 32); Low Mgrs. (N = 25);
Top Mgrs. (N = 17); Worker (N = 56)

Distance Between (rows):
Welder’s Job and Clerk’s Job
Welder’s Job and Teacher’s Job
Welder’s Job and Personnel Director’s Job
Welder’s Job and Vice-President’s Job
Clerk’s Job and Teacher’s Job
Clerk’s Job and Personnel Director’s Job
Clerk’s Job and Vice-President’s Job
Teacher’s Job and Personnel Director’s Job
Teacher’s Job and Vice-President’s Job
Personnel Director’s and Vice-President’s Job

[Only part of the body of the table is legible; the surviving entries, in the order in which they
appear, are: 62, 64, 68, 76, 66, 69, 84, 37, 44, 37; 48, 68; 59, 63, 74, 73, 64, 60; 58, 50, 45.]

* All distances adjusted so that the same N (N = 17) is used throughout.


Fig. 1. Relationships between five jobs as viewed by different groups. [Panels: 17 Upper Managers;
25 Lower Managers; 56 Workers; 32 Female Clerks.]

Note. V stands for Vice-President’s Job; P, Personnel Director’s Job; T, Teacher’s Job; C, Clerk’s
Job; W, Welder’s Job.

Osgood et al. (1957, p. 244 ff.) have described how the data from the semantic differential can be
used to compute distances between concepts. The greater the D-statistic, or distance between two
concepts, the more different the two concepts seem to be to the Ss doing the judging on the semantic
differential. Osgood’s procedure was used to determine the distances between the five jobs studied in
our field project. The D-matrix is shown in Table 1. Two-dimensional drawings of the three-dimensional
forms constructed from the D-matrix are shown in Fig. 1.
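The distance computation just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure actually used in the study. It assumes that each concept is represented by a list of mean ratings, one per scale; the function names and the three-scale sample profiles are hypothetical. The function `profile_similarity` assumes that the similarity index between actual and ideal supervisor has the form 1 - D²/(36n), where 36 is the squared range of a 7-point scale. No attempt is made to reproduce the adjustment to a common N used in Table 1.

```python
from itertools import combinations
import math


def sum_squared_differences(profile_a, profile_b):
    """Sum of squared scale-by-scale differences between two profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must be rated on the same scales")
    return sum((a - b) ** 2 for a, b in zip(profile_a, profile_b))


def d_statistic(profile_a, profile_b):
    """Osgood-style D: square root of the summed squared differences."""
    return math.sqrt(sum_squared_differences(profile_a, profile_b))


def d_matrix(profiles):
    """Return {(name1, name2): D} for every pair of named concept profiles."""
    return {
        (a, b): d_statistic(profiles[a], profiles[b])
        for a, b in combinations(profiles, 2)
    }


def profile_similarity(actual, ideal, scale_points=7):
    """Assumed form of the similarity index: 1 - D^2 / (max_d^2 * n)."""
    n = len(actual)
    max_sq = (scale_points - 1) ** 2  # 36 for a 7-point scale
    return 1.0 - sum_squared_differences(actual, ideal) / (max_sq * n)


# Invented three-scale mean profiles, for illustration only.
toy_profiles = {
    "welder": [2.0, 5.0, 3.0],
    "clerk": [4.0, 5.0, 3.0],
    "vice-president": [6.0, 2.0, 7.0],
}
distances = d_matrix(toy_profiles)
```

Under these assumptions the index is 1 when the two profiles coincide and 0 when they differ by the full scale range on every scale, so larger D always means lower similarity.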

The only major difference in the perception 
of the five jobs from group to group, as re- 
vealed by Fig. 1, is in the position of a Clerk’s 
Job relative to the other jobs. The workers 
view it in rather “exalted” terms, the top 
managers see it as the most different job as 
compared to the prestigeful Vice-President’s 
Job. Perhaps the large number of dissatis- 
fied clerks in the particular company is due 
to this perception of top management. Approximately
30% of the clerks dislike their
supervisors; this percentage is quite high for
this company.


It is interesting to notice that the most 
meaningful way to represent the job percep- 
tions of these groups is to draw a perpendicu- 
lar line between a Welder’s and a Vice-Presi- 
dent’s Job. It means that the most signifi- 
cant variable in the perception of jobs is the 
level of the job. 

It is also interesting to note that the man- 
agers make finer discriminations between jobs 
than do the workers. This is seen from the 
fact that the Ds obtained from the top man- 
agers are consistently higher than the Ds ob- 
tained from the workers. 


The Perception of the Man and the Job 


To what extent does the man determine the 
perception of a job, and the job the percep- 
tion of the man doing it? This is a complex 
question. We have only made a beginning 
towards answering it, but we did collect some 
data that are interesting. 

The 155 employees who participated in the
study rated Mr. T., the Personnel Director,
and also the Personnel Director’s Job on the
semantic differential. The reader will recall
that the semantic differential for people and 
the one for jobs were not the same. Never- 
theless, we did have a few scales which were 
equivalent. 


Job S.D. Scales                    People S.D. Scales

Evaluative Scales
High-low position                  Successful-unsuccessful
Requires much-no education         Educated-uneducated
Requires much-no experience        Experienced-inexperienced
High-low pay                       Gets high-low pay
Sociable-unsociable                Sociable-unsociable
Requires much-no intelligence      Intelligent-unintelligent
Clean-dirty                        Gracious-crude
Good-bad                           Good-bad humored

Potency Scales
Heavy-light                        Weak-strong
Important-unimportant              Powerful-powerless

Activity Scales
Active-passive                     Active-passive

Osgood et al. (1957, p. 91) D-scores were 
obtained from the discrepancies in the rat- 
ings of these corresponding scales. The reader 
will notice that the scale correspondence is 
not very close. The D-scores that were com- 
puted were very small. This is rather re- 
markable in view of the very rough corre- 
spondence between the scales. It appears 
that the perceptions of the job and the man 
are very closely interrelated. The job acquires
the characteristics of the man and the
man the characteristics of the job. 

From considerations deriving from Osgood
et al. (1957), on the reliability of the se- 
mantic differential, we conclude that, for our 
particular case, a D smaller than 48 means 
that there is complete fusion of the job and 
the man. For 80% of the top management, 
62% of the middle management, 68% of the 
lower management, 47% of female clerks, 
58% of the male clerks, and 56% of the 
workers there was such complete fusion. 

A reasonable hypothesis, it seemed, was 
that if a person had never had any experience 
with personnel men, in other words if he 
never worked before in a setting in which
there was a personnel man, this fusion between
job and man doing the job would be
even more complete. Surprisingly, this hy- 
pothesis cannot be supported by the data. In 
fact, if there is a relationship it runs in the 
opposite direction (p < .25, chi square, two- 
tail test). 


Summary 


Five jobs and 6 people were rated on 38 
scales of corresponding semantic differentials 
by 156 Ss representing various groups in in- 
dustry. The differences in the perception of 
the jobs and the people are discussed. 


Received January 12, 1959


EFFECTS OF ALTERING TASK COMPONENTS ON
PERCEPTUAL-MOTOR TASK LEARNING 1


JOHN A. WHITTENBURG,2 SHERMAN ROSS, AND T. G. ANDREWS


University of Maryland 


The rapid increase in the number and com- 
plexity of man-machine systems in contempo- 
rary military and industrial operations has 
demanded new attention to the problem of 
training human operators. Among other basic 
aspects of the training problem is a require- 
ment for determining the skills involved in 
the learning of complex perceptual-motor 
tasks. Knowledge of these skill requirements 
can contribute both to the development of 
efficient training procedures and to the de- 
velopment of effective task simulators to be 
used during the initial stages of training. 

One fundamental problem is the relative in- 
fluence of different task characteristics on the 
learning of complex tasks in man-machine 
systems. The purpose of this study is to 
assess the effects of altering operational com- 
ponents of a perceptual-motor task on the 
acquisition and retention in early and later 
stages of learning. The following general hy- 
pothesis was formulated: The acquisition and 
retention of various perceptual-motor skills 
can be influenced differentially by altering 
different display-control relationships of the 
same task and by introducing those altera- 
tions at different stages in the learning of that 
task. 

Several investigations of learning on per- 
ceptual-motor tasks indicate that the intro- 
duction of changes in a task produced posi- 
tive or negative transfer effects, and that these 
effects are in part a function of the level of 
learning attained prior to the introduction of 
the “new” task (Andreas, Green, & Spragg,
1953; Duncan, 1953; Lewis, McAllister, & 
Adams, 1951; McAllister, 1952). Previous 
studies also showed that difficulty differences between tasks often resulted in differential
transfer effects (Adams & Lewis, 1949; Bilodeau, 1952; Jones & Bilodeau, 1952). An open question
from these results is whether or not the differential transfer effects may be attributed in large
part to variations in task difficulty between the original and the new task rather than to changes
in the task characteristics. In addition, most studies deal with the effects of altering single task
components on the learning of the interpolated or “new” task (Carter, 1947; Gibbs, 1953; Helson,
1949). Since we have no satisfactory way to equate the level of learning produced by X trials on one
task with that produced by Y trials on another task, little systematic knowledge is available in
regard to the effects on the learning process of altering different task components at different
stages in learning.

1 This study is one of a series on behavior decrement supported by the Research Division, Office of
The Surgeon General, Department of the Army, Contract No. DA-49-007-MD-222 to The University of
Maryland.

2 Now at Human Sciences Research, Inc., Arlington, Va.

In order to test the above hypothesis and 
to provide a method for gaining information 
regarding the relative influence of different 
task characteristics on the learning process, 
a perceptual-motor task requiring continuous, 
compensatory control-display interactions was 
developed. This task was designed so that a 
number of task components could be syste- 
matically and immediately varied during 
learning without changing the nature or diffi- 
culty level of the task. These task compo- 
nents were experimentally varied at each of 
two stages of learning to determine their ef- 
fects on acquisition and retention. 


Method and Procedure 


Apparatus. The apparatus consisted of an electronic
compensatory tracking device which required
S to maintain a target indicator in the center of a 
5-in. oscilloscope. The target indicator was a }-in 
circular green disc of light which increased abruptly 
in brightness and in size to }-in. when off target.
The “on target” position shifted randomly 2.5 cm.
along the vertical axis during a given trial. The S 
was required to locate the on target position within 
the center area of the face of the oscilloscope and to 
make appropriate compensatory movements to remain
on target. In addition to the random displacement
of the on target position during a given trial, a
cam device driven by a 1-rpm motor resulted in 
variable changes in the direction and amount of 
vertical movement of the target indicator. The 
maximum vertical displacement of the target indi- 
cator by the cam device was 2.9 cm. above and be- 
low the center of the face of the oscilloscope. S was 
required to compensate for the cam driven vertical 
movements as well as to shift continuously the po- 
sition of the target indicator to stay on target. To 
perform this task, S was provided a 2-in. knob 
mounted horizontally on a shaft and positioned di- 
rectly below the face of the oscilloscope. Turning 
the knob in both directions and in varying amounts 
altered the direction and extent of movement of the 
target indicator. A trial lasted 1 min. and was terminated by a micro-switch operated by the cam.
Another trial started when E depressed a contact switch which activated the 1-rpm motor. The
criterion measure of performance was time on target recorded by a chronoscope sensitive to .01 sec.
The chronoscope was activated only when the S was on target and the time was cumulated over each
1-min. trial.

Task components. Three major functional com- 
ponents of perceptual-motor tasks were selected for 
the task to be learned: (a) directional relationships 
between control changes and display changes; (b) 
rate of change relationships between control and dis- 
play presentation, and (c) relative torque relation- 
ships between control and display movements. These 
three task components may be characterized as forming
integral parts of the operation of many
perceptual-motor tasks.

The display-control directional relationship was implemented by linking a horizontal movement of the
control knob to a vertical movement of the target indicator. Using a switch to change the polarity in
the circuit instantly reversed the directional relationship. The display-control rate relationship
was implemented by selecting certain values expressing the relationship of the amount of control
movement to the amount of target indicator movement. Two resistors of appropriate values were placed
in the circuit and use of a toggle switch permitted the two rate relationships to be altered
instantaneously by E. To instrument the display-control torque relationship, a magnetic fluid clutch
was coupled to the control shaft by means of gears. The relationship between control knob movement
and target indicator movement was accomplished in terms of differential torque in the control system
to the extent that S was off target. The farther S was off target, the greater the amount of torque
present in the control movement. To vary the torque values a variable resistor was placed in the
circuit. A knob attached to the resistor operated by E permitted a rapid selection to be made between
the two torque values.

There are eight separate tasks possible when the three display-control components are varied in each
of two ways. To facilitate discussion of these tasks a coding scheme was constructed to illustrate
the different tasks in terms of their distinct display-control characteristics (see Table 1). These
eight tasks were used in a pilot study to answer specific questions asked below and four of these
task variations were used in the main experiment.
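The count of eight tasks follows from crossing three two-level components (2 × 2 × 2). As a rough illustration only, the Python sketch below enumerates the task codes; the function name `all_task_codes` is hypothetical, and the letters T, D, and R with levels a and b follow the coding of Table 1.

```python
from itertools import product

# Component letters in the order used by the task codes (torque, direction, rate).
COMPONENT_ORDER = "TDR"
LEVELS = "ab"


def all_task_codes():
    """Enumerate every combination of two levels on three components."""
    return [
        "".join(letter + level for letter, level in zip(COMPONENT_ORDER, levels))
        for levels in product(LEVELS, repeat=len(COMPONENT_ORDER))
    ]


task_codes = all_task_codes()
```

With this ordering the list runs from TaDaRa to TbDbRb.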


Pilot Study 


Purpose. A pilot study was done to ob- 
tain information regarding two requirements 
for the perceptual-motor task: namely, sig- 
nificant practice effects in one experimental 
session which would permit a meaningful dis- 
tinction to be made between the early and 
later learning stages, and comparability in 
difficulty between variations in the task com- 
ponents. 


Table 1

Scheme for Coding the Display-Control Variables

Display-Control Variables                                      Code

Rate Variable (R)
  a. 1-1 Ratio (Control to Display)                            Ra
  b. 1.5-1 Ratio (Control to Display)                          Rb
Direction Variable (D)
  a. Control-Clockwise Direction, Display-Upward Direction     Da
  b. Control-Clockwise Direction, Display-Downward Direction   Db
Torque Variable (T)
  a. Maximum of 4.5 in. oz.                                    Ta
  b. Maximum of 11.5 in. oz.                                   Tb

Coded Tasks: TaDaRa, TaDaRb, TaDbRa, TaDbRb, TbDaRa, TbDaRb, TbDbRa, TbDbRb

Subjects and procedures. The data for
both the pilot study and the main experiment 
were collected at the U.S. Army Aberdeen 
Proving Grounds in Aberdeen, Maryland. 
The sample came from a population of male 
soldiers who had just completed basic train- 
ing and were awaiting reassignment to vari- 
ous schools for specialized training. The sam- 
ple was drawn from a Casual Company lo- 
cated on the base. Eighty Ss were used in 
the pilot study, 10 Ss for each of the eight 
task variations shown in Table 1. Each S 
was given 20 one-minute trials on the task. 
There were 25 secs. between trials. Immediately
preceding each trial a small light was
turned on above the face of the oscilloscope 
which informed S that 3 secs. later E would 
say “start” and the trial would begin. Each 
trial began with the target indicator in the 
on target position. 

Pilot study results. Determination of 
whether significant practice effects did occur 
in 20 trials for the eight task variations was 
made as follows: To stabilize the time per 
trial estimates for a statistical analysis of 
practice effects, the trials were grouped in 5 
blocks of 4 trials each. Thus, total time
on target scores for each S were summed
over 4 trials at a time. The analysis of
time on target data was performed by a “re- 
peated measurements” analysis of variance 
technique presented by Edwards (1950). The 
results of the statistical analysis demonstrated 
highly significant practice effects for all task 
variations. 
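The grouping of trials into blocks can be illustrated with a minimal Python sketch. The function name `block_totals` and the sample scores are hypothetical; the sketch shows only the summing step, not the repeated-measurements analysis of variance that followed.

```python
def block_totals(trial_scores, block_size=4):
    """Sum consecutive trials into blocks of equal size."""
    if len(trial_scores) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    return [
        sum(trial_scores[start:start + block_size])
        for start in range(0, len(trial_scores), block_size)
    ]


# Invented time-on-target scores (sec.) for one S over 20 trials.
example_scores = list(range(1, 21))
example_blocks = block_totals(example_scores)
```

Twenty trials summed four at a time give the five block scores per S that enter the analysis.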

The second question concerned the com- 
parability in difficulty among the eight task 
variations. For statistical analysis of the 
data, total time on target scores for each S
were summed over all 20 trials and
treated as the score. The statistical analysis 
was made by using the same repeated meas- 
urements analysis of variance technique men- 
tioned above. The results demonstrated no 
significant differences among the task varia- 
tions with regard to over-all ease of learning 
as measured by total time on target. On the 
basis of the results obtained from the pilot 
study, the requirements of the task for sig- 
nificant practice effects in one experimental 
session and comparability in difficulty among 
the eight task variations were satisfied. 




Main Experiment 


Selection of task components. One task was randomly selected as the standard task on which each S
initially practiced, and three experimental tasks were selected so that each involved one variation
of the three components found in the standard task. The standard task had the display-control coding
TaDbRa: a maximum torque of 4.5 in. oz., a control movement in the clockwise direction and the
display movement in the downward direction, and a 1 to 1 ratio of control to display movement. The
experimental tasks included TbDbRa (manipulation of torque), TaDaRa (manipulation of the directional
relationship), and TaDbRb (manipulation of the rate relationship).
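Each experimental task differs from the standard in exactly one component. A small Python sketch, with the hypothetical helper `single_component_variants`, makes the selection rule explicit; it assumes the six-character code format used above (a component letter followed by its level, a or b).

```python
def single_component_variants(standard_code):
    """Return the codes that differ from the standard on exactly one component."""
    flip = {"a": "b", "b": "a"}
    # Split "TaDbRa" into [("T", "a"), ("D", "b"), ("R", "a")].
    pairs = [
        (standard_code[i], standard_code[i + 1])
        for i in range(0, len(standard_code), 2)
    ]
    variants = []
    for index, (letter, level) in enumerate(pairs):
        changed = list(pairs)
        changed[index] = (letter, flip[level])
        variants.append("".join(l + v for l, v in changed))
    return variants


experimental_tasks = single_component_variants("TaDbRa")
```

Applied to TaDbRa, the sketch yields the torque, direction, and rate variations named in the text, in that order.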

Experimental design. The pilot study re- 
sults provided a general basis for identifying 
early and later learning stages of the task. 
Most rapid learning occurred during the initial 
5 trials with little increase in proficiency after 
about the 15th trial. Hence, trials 1-5 were 
considered the early learning stage and trial 
15 was designated the last trial before inter- 
polation of the altered task conditions for the 
later learning condition. It was assumed that 
five interpolated trials on the experimental 
tasks would be sufficient to test the hypothe- 
sis. We believed that this arrangement would 
reduce the possibility of having either too few 
trials to provide an adequate test of the hy- 
pothesis, or too many trials resulting in pos- 
sible overlearning of the interpolated task 
conditions (McGeoch, 1952; Osgood, 1953). 
To assess the effects of altering task compo- 
nents at different stages of learning on the 
acquisition and retention of the task-required 
skills, the number of trials for both condi- 
tions was extended to 35. This increase in 
the number of trials was designed to maxi- 
mize the likelihood of achieving a plateau in 
the learning functions for both stages of 
learning and to permit a direct statistical 
comparison between the early and later con- 
ditions using the performance data obtained 
on the last few trials. Based on the above considerations,
an experimental design was formulated
to test the hypothesis (see Table 2).


Subjects. One hundred and eight Ss drawn from the same population as in the pilot study were used.
Ss were randomly assigned to three conditions: 48 to the early learning, 48 to the later learning,
and 12 to the standard condition. The 48 Ss assigned to each of the early and later learning
conditions were assigned randomly to the three experimental groups. One group of 12 Ss was designated
as a control group, which had no practice on the task during the interpolated period. The Ss in the
control group were given magazines to read during the equivalent amount of time that the experimental
groups performed on the altered task conditions. There were 35 trials for each S. Each trial lasted
1 min. with 25 secs. between trials. Time on target was used as the primary measure of learning.


Table 2

The Experimental Task Variations During Early and Later Learning Stages

Condition         Before interpolation    Interpolated trials                    After interpolation

Early Learning    Standard (1-5)          TbDbRa, TaDaRa, TaDbRb, or Control     Standard (11-35)
                                          (6-10)
Later Learning    Standard (1-15)         TbDbRa, TaDaRa, TaDbRb, or Control     Standard (21-35)
                                          (16-20)

Note. Figures in parentheses are trial numbers.


Results 


Comparability of Ss. An initial determina- 
tion was made of the comparability of the Ss 
assigned to the standard, control, and the 
three experimental tasks for both the early 
and later learning conditions. A single clas- 
sification analysis of variance of total time on 
target scores for the initial five trials for all 
Ss was made. The results indicated that the 
Ss assigned to the different conditions did not 
differ significantly in total time scores for the 
five initial trials on the standard task. 

Sensitivity of statistical analysis. To increase the sensitivity of statistical analyses, time on
target scores for all Ss for the first five trials on the standard task were used to adjust scores
for individual performance differences on each condition. A total score for the first five trials
was obtained for each S. These scores were used to adjust scores for testing the experimental
hypothesis by an analysis of covariance procedure (Johnson, 1949).
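The kind of adjustment involved can be illustrated with the usual textbook formula for covariance-adjusted means: each group's criterion mean is corrected by the pooled within-group regression slope times the group's deviation from the grand mean on the covariate. The Python sketch below is an assumption about the general form of such an adjustment, not a reproduction of the computations in Johnson (1949); the function name `adjusted_means` and the sample data are invented.

```python
def adjusted_means(groups):
    """Covariance-adjusted group means.

    groups maps a condition name to a list of (covariate, criterion) pairs,
    e.g. (total time on first five trials, posttrial time on target).
    """
    all_x = [x for pairs in groups.values() for x, _ in pairs]
    grand_x = sum(all_x) / len(all_x)

    sum_xy = 0.0  # pooled within-group sum of cross-products
    sum_xx = 0.0  # pooled within-group sum of squares on the covariate
    group_means = {}
    for name, pairs in groups.items():
        mean_x = sum(x for x, _ in pairs) / len(pairs)
        mean_y = sum(y for _, y in pairs) / len(pairs)
        group_means[name] = (mean_x, mean_y)
        sum_xy += sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        sum_xx += sum((x - mean_x) ** 2 for x, _ in pairs)

    slope = sum_xy / sum_xx  # pooled within-group regression slope
    return {
        name: mean_y - slope * (mean_x - grand_x)
        for name, (mean_x, mean_y) in group_means.items()
    }


# Invented data for two conditions, for illustration only.
example = adjusted_means({
    "A": [(1.0, 2.0), (3.0, 6.0)],
    "B": [(3.0, 5.0), (5.0, 9.0)],
})
```

In this invented example condition B has the higher unadjusted mean (7 against 4), but after allowing for its higher covariate scores its adjusted mean falls below that of condition A.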

The effects of variations in task conditions. 
This aspect of the hypothesis was tested in 
two ways. First, it was reasoned that if 
variations in task conditions differentially af- 
fected the acquisition and retention of the 
task-required skills, criterion scores on the 
standard task following the experimental 
treatments should differ significantly as a 
function of the interpolated practice on the 
altered task components. Adjusted time on 
target scores subsequent to experimental vari- 
ations of the task components were analyzed 
for both early and later learning conditions. 
To obtain relatively stable time scores, the 
total time on target scores for the 5 posttrials 
were used and an analysis of covariance was 
performed. Analysis of criterion scores for 
both early and later learning conditions failed 
to reveal significant differential effects of al- 
tering task components on the performance of 
the standard task (see Fig. 1 and 2). A further
analysis using only the first posttrial for
early and later learning conditions was performed.
The results of this analysis showed
differences in time on target scores that were
significant beyond the .05 level for the early
learning condition (see Table 3). No significant
differences were obtained on the first
posttrial for the later learning conditions.
Inspection of the adjusted means for the early
learning condition indicated that changing the
torque characteristic facilitated performance
on the standard task. The findings also
Fig. 1. Mean time on target for early learning. (Ordinate: av. time on target, secs.; abscissa: trials.)

Fig. 2. Mean time on target for later learning. (Ordinate: av. time on target, secs.; abscissa: trials.)
Table 3

Analysis of Covariance of First Posttrial During Early Learning and Unadjusted and
Adjusted Means of First Posttrial for Each Condition

                                    Sum of          Mean
Source of Variation        df       Squares         Square
Between Conditions          3        31,014.00      10,338.00
Within Conditions          43       118,117.36       2,746.92

Total                               149,131.36

                           Unadjusted Means    Adjusted Means
Condition                  (Secs.)             (Secs.)
Standard                   31.07               31.07 a
Control                    30.65               27.33
Rate Change                26.32               27.37
Direction Change           28.57               27.73
Torque Change              32.55               35.65

a Due to lack of homogeneity of variance between standard and other conditions on the first posttrial, this condition was
omitted from the analysis.
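The arithmetic behind Table 3 can be checked with a short sketch. The sums of squares and mean squares are those printed in the table. The degrees of freedom and the F ratio are not quoted in the text and are derived here from the relation MS = SS / df, so they should be read as a reconstruction rather than as reported values.

```python
# Sums of squares and mean squares as printed in Table 3.
ss_between, ms_between = 31014.00, 10338.00
ss_within, ms_within = 118117.36, 2746.92

def implied_df(sum_of_squares, mean_square):
    """Degrees of freedom implied by MS = SS / df."""
    return round(sum_of_squares / mean_square)

df_between = implied_df(ss_between, ms_between)
df_within = implied_df(ss_within, ms_within)
f_ratio = ms_between / ms_within

print(f"df = ({df_between}, {df_within}), F = {f_ratio:.2f}")
```

An F of about 3.76 on 3 and 43 degrees of freedom exceeds the tabled .05 critical value of roughly 2.8, which agrees with the statement that the differences were significant beyond the .05 level.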


showed that changes in rate and directional
task characteristics led to transient interference
effects on the subsequent standard task.
Although the hypothesis was supported by
this latter analysis, the differential effects of
altering task components on subsequent learning
were both transitory and evident only
during the early learning condition.

A further analysis was carried out on the
effects of altering task characteristics on the
learning process. The adjusted time scores
obtained on the “new” tasks during the interpolated
practice period were analyzed for
early and later learning conditions. Total
time on target for each S was used as the
score (see Table 4). The results showed significant
differences for the early learning condition.
The findings were identical for the
interpolated period as for the subsequent relearning
trials on the standard task. Altering
torque facilitated performance, while changing
rate and directional relationships produced
relative interference on the altered task
conditions. An analysis of covariance on the
interpolated task scores during the later
learning condition failed to yield significant
differences among the task variations. A t
test was performed between the scores made
on the reversed display-control directional
characteristic and the standard task during


Table 4

Analysis of Covariance of Total Time on Target Scores for Interpolated Tasks During
Early Learning and Unadjusted and Adjusted Means for Each Condition

                                    Sum of            Mean
Source of Variation        df       Squares           Square
Between Conditions          3         981,847.66      327,825.53
Within Conditions          43       1,884,287.65       43,820.64

Total                               2,866,135.31

                           Unadjusted Means    Adjusted Means
Condition                  (Secs.)             (Secs.)
Standard                   28.80               28.60
Rate Change                23.44               23.50
Direction Change           24.54               24.17
Torque Change              29.53               30.05
Table 5

The Average Time Scores for the Last Pretrial and
First Posttrial for the Later Learning Condition

Condition            Trial 15    Trial 21
Standard             31.7        31.2
Control              32.4        31.9
Direction Change     33.7        30.1
Rate Change          32.6        31.4
Torque Change        30.9        29.4


the interpolated period. The results were 
significant beyond the .01 level, showing
that altering display-control directional rela- 
tionship interfered with performance on the 
reversed task as compared with performance 
on the standard task. 

Effects of altering task components at dif- 
ferent learning stages. The previous analy- 
sis indicated that altering different task com- 
ponents during the initial acquisition of task- 
required skills produced differential and tran- 
sient effects. One further analysis was per- 
formed with the later learning condition. This 
analysis was based on the observation that 
the first posttrial after interpolation of the 
experimental tasks showed that the time 
scores were generally lower than the last pretrial
scores. A correlated t test of the difference
score for all experimental conditions
yielded a significant difference between trials 
15 and 21 beyond the .05 level. Apparently, 
the major effect of altering task components 
later in learning on this task is to produce a 
temporary decrement in performance on the 
retention of the standard task regardless of 
the task characteristic altered (see Table 5). 
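The form of a correlated t test on difference scores can be sketched as below. The test reported above was run on the difference scores of individual Ss, which are not given in the text. For illustration only, the sketch uses the three experimental condition means from Table 5, so the resulting t is not the reported statistic, and the function name is hypothetical.

```python
import math

def correlated_t(before, after):
    """Return (mean difference, t, df) for paired scores."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    t = mean_d / math.sqrt(var_d / n)
    return mean_d, t, n - 1

# Condition means from Table 5 (Direction, Rate, and Torque Change).
trial_15 = [33.7, 32.6, 30.9]
trial_21 = [30.1, 31.4, 29.4]
mean_d, t, df = correlated_t(trial_15, trial_21)
print(f"mean decrement = {mean_d:.1f} secs., t = {t:.2f}, df = {df}")
```

Every experimental condition shows a lower mean on Trial 21 than on Trial 15, which is the pattern the paragraph above describes.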


Discussion 


With regard to the hypothesis of this study 
and the results obtained, the effects of alter- 
ing task components at different stages of 
learning are interpreted as follows: during the 
initial stages of learning on perceptual-motor 
tasks of this type, behavior is influenced by 
the “immediate sensory information” provided
by the display-control relationships.
Depending on the characteristics of the task,
responding to these changed display-control
components may either facilitate or interfere
temporarily with learning on the task. With 
the development of task proficiency, any 
change in the display-control relationships 
has no appreciable effect other than a very 
transient interference effect. This transient 
interference effect may indicate nothing more 
than the time required by the operator to 
perceive and adjust to the altered display- 
control relationship. In the task situation 
studied in this experiment, the display-con- 
trol changes were accomplished between trials 
without S’s knowledge. S had to begin a trial 
before becoming aware of any possible task 
changes. 

An attempt to suggest possible factors which 
contributed to the differential effects of alter- 
ing display-control relationships on the learn- 
ing process is centered on the nature of the 
task and its operational characteristics. The 
nature of this task was such that maintain- 
ing fairly constant rates of control movement 
resulted in higher time scores than rapid, 
ballistic movements. During the initial learn- 
ing trials, there is a marked tendency by S to 
respond extensively and rapidly to changes in 
the display. This aspect has been noted 
previously with perceptual-motor tasks (Kel- 
logg, 1946) and is also reflected in a measure 
of the amount of control movement recorded 
in the present study. The introduction of 
greater torque in the control system apparently
acted to reduce the extent of control
movements. Thus, S might have learned 
that slower controlled movements resulted in 
greater ease in “getting and staying” on tar- 
get. On the other hand, increasing the rate 
relationship between amount of control move- 
ment to target indicator movement and re- 
versing the directional relationship apparently 
enhanced the initial tendency to ballistic 
movements and oscillation behavior. Sup- 
posedly, if the rate experimental task condi- 
tion had required a lower rate relationship 
between control and display, this would have 
tended to produce the same effects as increas- 
ing the amount of torque. One implication 
of this interpretation is the hypothesis that 
combining several variations in the task which 
tend to reduce the initial “overcontrolling”
and ballistic movements should lead to more 
pronounced facilitation. On the other hand, 
combining task variations that tend to enhance
this initial behavior should lead to more
pronounced interference effects. Systematic 
combination of different task characteristics 
which compete in tending to dampen or en- 
hance this initial control movement behavior, 
and analysis of the effects on subsequent ac- 
quisition on the standard task may provide 
a more sensitive technique for determining 
the relative influence of different task charac- 
teristics on the acquisition and retention of 
task-required skills. This procedure may 
have potential research value when consider- 
ing that the selection of techniques for study- 
ing various facets of behavior efficiency is 
one of the more difficult design problems fac- 
ing the researcher in this area (Andrews & 
Ross, 1955). 

The results of the experiment permit the 
following two conclusions to be made: (a) 
during the early learning stage on a com- 
pensatory perceptual-motor task, the task-re- 
quired skills are more affected by altering 
task components than during the later learning
stage, and (b) the effects of altering different
task components on the acquisition of
the task-required skills appear to be a func- 
tion of the operational characteristics of the 
task. 

Summary 


The problem was to determine the effects 
of manipulating different display-control re- 
lationships of a perceptual-motor task on 
the acquisition and retention of task-required 
skills, in early and later stages of learning. 

A preliminary experiment permitted the de- 
termination of the learning characteristics of 
different task components and the compara- 
bility in over-all performance achieved among 
these components. Ten Ss were assigned to 
one of eight tasks in which three display- 
control relationships were varied each in two 
ways. These tasks involved variations in dis- 
play-control directional, rate of change, and 
differential torque relationships. Each S per- 
formed on a task for 20 trials. Each trial 
lasted 1 min., with 25 secs. between trials. 
Time on target was the primary measure of 
performance. Results indicated significant 
practice effects for all tasks. All tasks were 
demonstrated to be comparable in difficulty. 


The main experiment involved selection of 
one task as the standard task with three ex- 
perimental tasks selected so that each had 
only one display-control characteristic differ- 
ent from the standard task. Nine conditions 
were investigated, and 12 Ss were assigned to 
each condition. The display-control charac- 
teristics were manipulated in the early and 
later learning stages on the standard task. 
Ss were divided into two groups, one for early 
and one for later learning. Each learning 
stage condition consisted of three experimen- 
tal groups and one control group. Another 
group of 12 Ss performed all trials on the 
standard task. A total of 35 trials of 1-min. 
duration and 25 secs. between trials was given 
to each S. For the early learning condition, 
each S practiced for five trials on the stand- 
ard task, then for five trials on one of the ex- 
perimental tasks or on the control condition. 
Then all Ss practiced on the standard task 
the remaining 25 trials. For the later learn- 
ing condition, all Ss practiced on the stand- 
ard task for 15 trials, then for five trials on 
either the experimental tasks or control con- 
dition, and the remaining 15 trials on the 
standard task. Time on target was the pri- 
mary measure of performance. 

Results showed that altering different dis- 
play-control relationships early in learning 
produced transient facilitative and interfer- 
ence effects on subsequent learning of the 
standard task. Changing these characteristics 
during later learning depressed the first post- 
trial scores with respect to the last pretrial. 
It was concluded that different display-con- 
trol relationships differentially facilitate or in- 
terfere with the learning process during the 
initial learning stage and that the nature of 
the effect is related to the operational charac- 
teristics of the task. 
PERSONALITY OF THE ROUTE SALESMAN IN A
BASIC FOOD INDUSTRY ¹


DAVID A. RODGERS 


University of California, Berkeley 


While much attention has been given to the 
selection of salesmen, relatively little has been 
given to their general personality character- 
istics (see Roe [1956] for a survey of rele- 
vant work). Most studies have used only one 
or two assessment instruments, usually ones 
highly focussed on the job situation itself. 
This paper reports an intensive personality 
study, using a variety of tests, of a carefully 
selected sample of the route salesmen in a 
national wholesale basic food company. Com- 
parisons were made between the salesmen’s 
personalities as seen by themselves, by their 
bosses, and by a psychologist, and between 
their personalities and their job requirements 
as seen by themselves and their bosses. The 
purpose of the study was to determine the 
personality characteristics of a typical group 
of route salesmen. 


Procedure 
Subjects. Two selling units in a food company of 
national scope were selected by the head of the sales 
division as being typical of the company selling op- 
eration. The sales manager of each unit selected six 
of his route salesmen to represent a cross section of
his unit and rank ordered them from good to poor 
as employees. These 12 salesmen constitute the Ss 
of the study. They sell a large line of related prod- 
ucts to retailers on regular routes. The work is 
highly competitive. It requires the Ss to make many 
decisions on their own about price and display and 
to adjust to many situations. The sales manager of 
each unit has almost complete job authority over the 
Ss under him. 

Role description inventory. It was desired to obtain
standardized and comparable descriptions of
what the sales manager thought each S was like (the 
persona, or P), what the sales manager wanted each 
S to be like (the role demand, or RD), what each S 


1 The data for this study and many of the analy- 
ses were taken from the author’s unpublished doc- 
toral dissertation, Personality Correlates of Successful 
Role Behavior, Univer. of Chicago, 1953. The au- 
thor is indebted to Nejelski and Co., Inc. Manage- 
ment Counsels, for opportunity to collect the data, to 
D. L. Grummon, E. A. Haggard, and M. I. Stein for 
invaluable assistance in planning and executing the 
study, and to L. W. Porter for helpful comments on 
the manuscript.


thought the sales manager wanted him to be like
(the role concept, or RC), what each S thought he
himself was like (the self concept, or SC), and what
a psychologist thought each S was like on the basis
of a battery of tests and interviews (the clinical picture,
or CP). For this purpose, a Q sort (Stephenson,
1953), designated QS, was developed. QS consists
of 108 statements of attitudes and behaviors
expressive of 36 of Murray’s need-press variables
(Murray, 1938). Murray’s variables were used as a
basis for selecting the Q items to insure a wide representation
of characteristics with a minimum of emphasis
on any one aspect of personality or behavior.

The statements in QS were to be card sorted into
a forced-frequency quasi-normal distribution from
most characteristic or descriptive to least characteristic
or descriptive of the person or role dimension
being described. The distribution used was
1-6-18-29-29-18-6-1, slightly platykurtic, with g2 equal to
-0.29.
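The forced distribution can be checked numerically. The sketch below assumes the eight piles are scored with equally spaced integers (1 through 8), which the text does not state. Under that assumption the pile frequencies account for all 108 statements and reproduce the kurtosis coefficient quoted above.

```python
# Forced Q-sort frequencies as given in the text.
frequencies = [1, 6, 18, 29, 29, 18, 6, 1]
# Assumption: the eight piles are scored with equally spaced integers 1..8.
scores = range(1, len(frequencies) + 1)

n_items = sum(frequencies)
mean = sum(f * s for f, s in zip(frequencies, scores)) / n_items

def central_moment(order):
    """Frequency-weighted central moment of the pile scores."""
    return sum(f * (s - mean) ** order for f, s in zip(frequencies, scores)) / n_items

g2 = central_moment(4) / central_moment(2) ** 2 - 3  # excess kurtosis
print(n_items, round(g2, 2))
```

A negative excess kurtosis marks the distribution as flatter than normal, that is, slightly platykurtic.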

Data collected. Each S was given the following
tests: Wechsler-Bellevue, Form L; Rorschach; Thematic
Apperception Test, 20 pictures for adult males;
Group TAT, 3 selected pictures (Stouffer & Toby,
1951); Draw-a-Person; Draw-a-Salesman; 50 sentence
completion items selected from the Stein SCT;
50 sentence completion items for salesmen, prepared
for this study and focussed specifically on the selling
situation; a structured interview dealing broadly with
personal history and attitudes; and the RC and SC
descriptions already mentioned, using QS. The CP
descriptions of each S were based on this battery of
tests minus RC and SC, which were not examined
until after the CP descriptions had been completed.
The analyses and CP descriptions were made by the
author, using conventional principles of interpretation.
As previously indicated, P and RD descriptions
were also obtained for each S, using QS. Reliability
checks were provided by resortings of QS for
RC and SC six months after the initial sortings and
by resortings of QS for CP for each S after 11 intervening
CP sortings (Table 1). No reliability checks
were made for P and RD.


Analyses and Results 


For each of the variables CP, SC, P, RC, and RD,
the 12 relevant QS descriptions were intercorrelated
and factor analyzed, using Thurstone’s centroid
method. The loadings on the first centroids are
shown in Table 2. The obtained centroid factors
for CP, SC, P, and RC were algebraically rotated
(Rodgers, 1957) to produce maximum correlation
with the sales managers’ rank orderings (success
ratings) of the Ss. The rotations were made to deter-
Table 1


Scores of Subjects on Variables Indicated 


Subject 


Variable 1 ! ! J f 5B 
Boss Ranking for 


Good Employee


Good Salesman 


Buddy Ranking for 
Good Employee 
Good Salesman 
Good Friend 


Age in Years 
Years in Sales in 


Present Company 
Years of School 


Wechsler-Bellevue 
Performance IQ 111 102 
Verbal IQ 7 108 109 
Total IQ 110 106
Fisher Rigidity Index ; 3. - 35 j . 71 


Sort-Resort r of 
CP Description 64 82 18 7 ; .65 70 
SC Description 5 58 .30 74 A 55 5S A .60 an 
RC Description ; x i 37 .69 j 00 P 57 .60 


* Rank difference cor or ss ranking for good employee Groups were ranked separately 


Table 2 


Loadings on Indicated Factors 


Factor 
1st Centroid Rotated
Subject" : SC Cc , j Sc RC 


1A 
1B 
2A 
2B 81 
3A 

3B 

4A 

4B 

5A ; 
5B 60 
6A 

6B 68 


06 
02 
13 
41 
28 06 
16 26 
26 6 ‘ ote - - 28 10 
36 7 98 -. 17 
28 A5 _ — 36 
10 7 99 39 .06 
37 5. am .26 —. — 39 
19 78 99 28 30 — .28 .09 



a Numbers 1 through 6 indicate success rankings assigned by bosses, 1A and 1B being best employees in Groups A and B,
respectively.
b RD descriptions factored separately for Groups A and B.
mine whether there were factor dimensions that distinguished
successful salesmen from unsuccessful ones.
Obtained loadings are shown in Table 2. Relatively 
satisfactory discrimination was obtained for CP and 
P, correlations between the rotated-factor loadings 
and the success ratings being .61 and .92 respectively. 
Satisfactory separations were not obtained for SC 
or RC, correlations between the rotated-factor load- 
ings and the success ratings being .17 and .19, re- 
spectively. The failure to achieve satisfactory sepa- 
ration for SC and RC is of interest and should be 
noted. The factor rotation method maximizes chance 
as well as nonchance relationships between the suc- 
cess ratings and the several (four for SC and three 
for RC) sets of factor loadings making up the or- 
thogonal reference frames. Failure to achieve ap- 
preciable correspondence between the rotated-factor 
loadings and the success ratings, in spite of maximiz- 
ing such chance similarities, can be taken as convinc- 
ing evidence that little nonchance relationship exists. 

The factor saturations of the QS items were com- 
puted for the CP, SC, RC, and RD first centroids 
and for the CP and P factors correlating highest 
with the success ratings. Spearman’s formula for 
weighting (Spearman, 1927, Appendix XX) was used 
for computing the saturations. Since the factor load- 
ings of the P first centroid correlated .96 with the 
loadings on the rotated P factor, item saturations 
for the P first centroid were not computed. They 
would be almost identical with those for the rotated 
factor. 

From the saturations, the items most characteristic 
and least characteristic of each factor were deter- 
mined.2 According to these saturations, the boss 
sees the good salesman as success-oriented, deter- 
mined, and rather dominant, in contrast to the poor 
salesman, who is confused, complaining, and dis- 
tractible. Each of the salesmen, good and poor alike, 
tends to see himself as success-oriented, sociable, co- 
operative, energetic, and confident without being 
cocky. The CP description is somewhat at variance 
with both of these views. In terms of the tests, the 
salesmen all appear to be highly conforming, ma- 
terialistic, attention-desiring individuals who have 
few internalized standards of conduct and little abil- 
ity to form or maintain close interpersonal relation- 
ships. In terms of the tests, the successful salesmen 
are more dominant, vigorous, controlled, and self- 
satisfied than are the less successful ones, these dif- 
ferences being similar to those seen by the bosses. 
For all Ss, the job expectations are that they be 
success-oriented, determined, persistent, and rather 


2 Tables A to F, listing the most characteristic
items and the least characteristic items for each of 
the indicated factors, and Table G, summarizing the 
Rorschach scoring for each S according to Beck’s 
system, have been deposited with the American Docu- 
mentation Institute. Order Document No. 5887, 
from ADI Auxiliary Publications Project, Photo- 
duplication Service, Library of Congress, Washing- 
ton 25, D. C., remitting $1.25 for 35 mm. microfilm 
or $1.25 for 6 by 8 in. photocopies. Make checks 
payable to Chief, Photoduplication Service, Library 
of Congress.


Table 3

Distributions of Fisher’s Index of Rigidity Scores

            Salesmen in              Fisher’s Data
Measure     Present Study    Normals    Hysterics    Paranoids
Mean        46.2             24.9       44.3         44.1
SD          14.4             11.8       17.2         13.6
N           12               20         20           20


dominant individuals and the salesmen have a fairly 
accurate understanding of these expectations. 

Fisher’s Index of Rigidity scores (Fisher, 1948,
1950) were computed from the Rorschach protocols 
(see Footnote 2) and are shown in Table 3. A sum- 
mary of the relationships of various variables to the 
success ratings is shown in Table 1. 

Discussion 

The first centroid factor loadings for all 
variables except P are consistently high and 
positive (Table 2), indicating that the Ss are 
a homogeneous group with similar personali- 
ties, as might be expected of people in similar 
jobs. In spite of such similarity, however, 
the good salesmen are seen by their bosses as 
being quite different from the poor salesmen, 
the P first centroid being bipolar (Table 2) 
and the loadings on it correlating .86 with 
the success ratings, by the rank difference 
method. Although much alike in other re- 
spects, the Ss differ in terms of their job 
abilities and it is primarily in terms of these 
differences that they are seen by their bosses. 
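The rank difference method referred to above is Spearman's rank correlation. A minimal sketch of the untied-ranks formula follows; the ranks are hypothetical and are not taken from Table 1 or Table 2, and the function name is illustrative.

```python
def rank_difference_r(ranks_x, ranks_y):
    """Spearman rank-difference correlation for untied ranks."""
    n = len(ranks_x)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Hypothetical ranks for six salesmen; not data from the study.
success_rank = [1, 2, 3, 4, 5, 6]
loading_rank = [1, 3, 2, 4, 6, 5]
print(round(rank_difference_r(success_rank, loading_rank), 2))
```

With only six ranks per group, two adjacent swaps are enough to pull the coefficient down to about .89, a reminder of how few observations lie behind the correlations reported here.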

As already noted, no significant differences 
related to job success were found in the SC 
and RC descriptions of this group. It might 
be possible to find such differences between a 
group like this and a group of Ss that were 
totally unsuited for the job, since all of the 
Ss in the present study were at least suffi- 
ciently competent to keep their jobs. The re- 
sults do suggest, however, that self and role 
descriptions on broadly based personality in- 
ventories may have much more limited utility 
for employee selection than do either descrip- 
tions on inventories narrowly focussed on the 
specific job requirements or other kinds of 
evaluations such as interviews by supervisors, 
who may be highly sensitized to those char- 
acteristics that are important for the par- 
ticular job. 
Too few Ss are involved in the study to
give clear-cut relationships between the suc- 
cess ratings and the variables of Table 1. 
There is some indication that age and length 
of time in sales with the company may be 
related to job success. As might be expected, 
the boss rankings for selling ability correspond 
closely to the success ratings. The buddy 
rankings for selling ability and for good em- 
ployee are positively but not perfectly corre- 
lated with the boss rankings, indicating that 
a person’s abilities as seen by his colleagues 
are not necessarily the same as those seen by 
his boss. 

According to Fisher’s Index of Rigidity, the 
Ss have markedly rigid personality structures 
(Table 3). All of the tests reveal what 
might be called personality impoverishment. 
However, much of the “impoverishment” 
seems well suited to help the salesman in his 
job. This is especially true of certain char- 
acteristics that the author feels are highly 
descriptive of all of the Ss studied. These 
characteristics were inferred from a general 
evaluation of the test and interview mate- 
rials and from observation of the salesmen on 
their routes. The characteristics are: 


1. Dependence on other people’s opinions
and absence of own opinions. The Ss seemed
to have few strongly-held opinions of their 
own. In Riesman’s terms, they were other- 
directed (Riesman, 1950) to the extreme. 
This seemed to help them adjust easily to a 
constantly and widely varying line of prod- 
ucts and to customers of widely varying be- 
liefs. They seemed skilled at probing for and 
discovering other people’s opinions and at 
using these opinions to establish quick rapport 
with new acquaintances. If they did not 
have someone else’s opinions to lean on, how- 
ever, they seemed uncomfortable and at loose 
ends. 

2. Cathection of the tangible and lack of 
interest in intangible values. Being interested 
in possessing material things themselves, they 
seemed better able to convince other people 
of the value of possessing the commodities 
they were selling. They also seemed less in- 
hibited by concepts of right and wrong or of 
proper and improper, and were correspondingly
more willing to do what was pragmatically
necessary to make a sale. They were
willing to put up with more than ordinary
personal discomfort and inconvenience in or- 
der to attain monetary gain. 

3. Superficiality of relationships and basic 
distrust of people. The Ss were much con- 
cerned about maintaining good superficial re- 
lationships but seemed quite unlikely and un- 
able to form close or permanent attachments 
to people. They seemed basically to distrust 
people but to conceal the distrust behind a 
facade of congeniality. This characteristic 
seemed to help them develop great skill in 
appearing to be friendly and helpful without 
really becoming concerned about the other 
person’s welfare. They could then skillfully 
manipulate relationships to sell their product 
without worrying about whether the customer 
really needed the product or whether it was 
the best the customer could obtain for the 
price. Such an attitude is perhaps the es- 
sence of Fromm’s (1947) marketing person- 
ality. 

Such analyses suggest that a certain amount 
of “psychopathology,” provided it is of the 
right sort, may be beneficial or even essen- 
tial in some jobs, rather than harmful as is 
often supposed. 


Summary 


A group of 12 route salesmen selected to 
represent a cross section of the wholesale 
selling force of a large company in a basic 
food industry were given an extensive bat- 
tery of tests. From these, standard “clinical” 
descriptions of the salesmen were prepared. 
The salesmen described themselves and their 
job requirements, and their bosses described 
them and the job requirements. These vari- 
ous descriptions of the Ss and their jobs are 
compared and are related to the bosses’ rank- 
ings of the Ss as “good” to “poor” em- 
ployees. The personality characteristics com- 
mon to all of the salesmen and those dif- 
ferentiating the more successful from the 
less successful are identified. The way in 
which the salesmen’s personality character- 
istics adapt them for their job is discussed. 


Received September 26, 1958. 
COMPARISON OF TWO STYLES OF LEADERSHIP
IN SMALL GROUP DISCUSSION ¹


RICHARD H. PAGE AND ELLIOTT McGINNIES


University of Maryland 


Although discussion group leadership may 
vary in many ways, a common distinction in- 
volves what is variously termed as “demo- 
cratic,” “non-directive,” “group-centered,” or 
“permissive” leadership as opposed to an 
“autocratic,” “directive,” or “leader-centered” 
style. Despite a vast amount of research in 
the area of leadership, it is not entirely clear 
just how directive a discussion leader should 
be for maximum effectiveness. In fact, it is 
apparent from several studies (Leavitt, 1951; 
Shaw, 1955) that a distinction must be made 
between efficiency of group performance and 
satisfaction of the members with the group, 
since the two are not always compatible. 

The weight of research findings dealing with 
leadership style seems to favor the democratic 
or group-centered approach over the authori- 
tarian or leader-centered variety. In the edu- 
cational setting, for example, a number of 
studies (Anderson & Brewer, 1954; Flanders, 
1951; Robbins, 1952) suggest that less domi- 
nating, student-supporting teacher behavior 
produces fewer hostile and aggressive re- 
sponses and more integrative group behavior 
than does more directive leadership. That 
group-centered leadership tends to result in 
greater “social-emotional growth” and insight 
is evidenced in several studies (Asch, 1951; 
Bovard, 1952; Faw, 1949; Gross, 1948). It 
has also been shown that the group-centered 
approach is superior in altering the percep- 
tions of members in the direction of a group 
norm (Bovard, 1951b), in producing greater 
communication of feeling among members 
(Bovard, 1952) and in stimulating group interaction
(Bovard, 1951a). Other studies
(Hare, 1953; Preston & Heintz, 1949) have 
indicated that “participatory” leadership pro- 


¹ This research was supported by a special grant
from the National Institute of Mental Health, United 
States Public Health Service, Department of Health, 
Education, and Welfare. Ss were obtained through 
the cooperation of the American Association of Uni- 
versity Women. Donald K. Pumroy served as dis- 
cussion leader for the groups.
duces more change in privately held opinions
and greater satisfaction with group products 
than does “supervisory” leadership. There is 
also evidence (Guthe, 1945; Levine & Butler, 
1952; Radke & Klisurich, 1947) that group 
discussion is more effective than lectures in 
changing the opinions of group members. 
Many of the advantages claimed for the 
group-centered style of leadership can prob- 
ably be attributed to the facilitating effect of 
this approach upon member participation. 
Group satisfaction and productivity seem to 
depend, however, upon the “quality” as well 
as the “quantity” of participation (Fouriezos, 
Hutt, & Guetzkow, 1950), and in this respect 
it is significant to note that democratic dis- 
cussion leadership also seems to improve the 
quality of group decisions (Maier & Solem, 
1952). 

Evidence for the superiority of a more di- 
rective or formal leadership role in some 
group situations is also available. Member 
satisfaction with decision-making in govern- 
ment and industry has been found to be posi- 
tively correlated with the degree of control 
exerted by the designated chairman (Berko- 
witz, 1953). When final grades were used as 
a criterion, the lecture method was judged 
superior to group discussion in classroom 
teaching (Asch, 1951; Husband, 1949). Di-
rective leadership in therapy groups resulted
in more therapeutic gains, higher satisfaction, 
and better attendance than did a nondirective 
approach (Evans, 1950). Finally, college 
students have been reported to prefer discus- 
sion classes which were directively led to a 
more permissive discussion, with only the 
poorer students performing better on ex- 
aminations under directive leadership (Wispe, 
1951). 

In a study rather closely related in pur- 
pose to the present one, Wischmeier (1955) 
observed the discussion behavior of college 
students exposed to both leader-centered and 
group-centered leadership at two separate
meetings. He found that over 75% of the
group members were aware of the differences 
between the two leadership roles, and that all 
groups ranked the leader-centered role higher 
on the value of the leader’s contributions. 
However, the Ss rated the group-centered dis- 
cussions higher on such items as degree of 
personal involvement, warmth and friendli-
ness of atmosphere, ease with which contribu-
tions could be made, and cooperativeness of
the discussion atmosphere. In discussing
these results, Wischmeier states, “This would
suggest that the group-centered leader is not 
likely to receive much appreciation or recog- 
nition for his leadership services, although he 
is likely to be more ‘successful’ (in terms of 
involvement, cooperativeness, etc.) in leading 
his group.” 

We have attempted to discover whether two 
styles of discussion leadership would be dif- 
ferentially perceived by adult members of 
small discussion groups and to determine 
something of the nature of these perceptions. 
The two styles may be described broadly as 
directive, or leader-centered, and nondirec- 
tive, or group-centered. We asked the fol- 
lowing questions: 


1. What attributes are assigned by mem- 
bers of discussion groups to a leader playing 
directive and nondirective roles? 

2. Will the same discussion leader be per- 
ceived more favorably in a directive or in a 
nondirective role? 


3. Are judgments about directive and non-
directive leadership related to amount of par-
ticipation in discussion by those doing the
rating?

4. Will group opinions about the value of
the discussion be determined by the type of
discussion leadership experienced?


Procedure 


Subjects. Six groups ranging in size from 6 to 16 
female members were scheduled to meet once each 
for the purpose of viewing a mental health film, 
“The Feeling of Hostility,” and holding a subsequent 
25-minute discussion. The groups were assigned ran-
domly, three each, to directive and nondirective lead-
ership conditions. Biographical information obtained
from the participants showed the groups under the
two conditions to be comparable in most respects.
Average age of all Ss was 39 years, 80% were mar-
ried, over 90% had completed four or more years of
college, and family incomes averaged better than
$7000 annually. A total of 65 women, representing 
upper-level groups both educationally and economi- 
cally, constituted our experimental population. Since 
these are fairly typical of the types of individuals 
who are active in educational programs concerned 
with mental health, it was felt that the selective na- 
ture of the sample was appropriate for the present 
study. 

Discussion leadership. A colleague, trained in role- 
playing procedures, served as discussion leader for 
all of the groups. In three of the discussion sessions 
he assumed what we have broadly termed a “direc- 
tive” role, while in the remaining three he adopted 
a relatively “nondirective” approach. Briefly de-
fined, the directive role required that the discussion
leader serve as a professional “expert” for the group, 
interpreting and explaining points that were made 
in the film, responding directly to questions from 
the group, and venturing his own opinions whenever 
an appropriate occasion arose. In the nondirective 
situation, the leader refrained from interpreting the 
film, reflected questions and comments from indi-
viduals back to the group, and limited expression of
his own viewpoint as much as possible. The ef-
fectiveness with which these contrasting roles were
played was determined from examination of ques- 
tionnaires completed by the Ss at the conclusion of 
the discussions and will be described later. 

Experimental conditions. Sessions were held un- 
der informal conditions in the homes of the par- 
ticipants, and, with their knowledge and permission, 
the discussions were tape-recorded. Although the Ss 
were allowed to remain anonymous, identification by 
symbols of each discussion participant with his com-
ments was accomplished by a procedure described in
detail elsewhere (McGinnies, 1956). At the begin- 
ning of each meeting, all Ss filled out a short bio- 
graphical inventory. They then viewed the film and 
held a discussion of it. Following the discussion, the 
group members answered a questionnaire designed to 
obtain their opinions of the film, the discussion, the 
discussion leader, and the value of the total experi-
ence.

Evaluation techniques. The data from which the 
hypotheses of the experiment were to be evaluated 
were obtained from the questionnaire administered 
after the discussion. The first part of this form con- 
sisted of a list of 20 pairs of polar adjectives, each 
pair defining the limits of a 13-point rating scale.
The Ss were asked to place a checkmark against 
that position of each scale which they felt best char- 
acterized the discussion leader’s behavior. In 19 of 
the 20 adjective-pairs, one of the terms was clearly 
more complimentary or favorable to the leader than 
the other, the direction of favorable response having 
been determined through pretesting with a group of 
college students. One of the adjective-pairs was 
ambiguous but was included because it seemed rele-
vant to the problem. The adjectives selected were
based upon a list devised by Molnar (1955) for
evaluating discussion leadership.

Following administration of the adjective check-
list, the group members responded to 14 questions
designed to assess their perceptions of the film, the 
discussion, the discussion leadership, and the over- 
all proceedings. 

Measures of leader behavior. In order to validate 
the actual differences in the two roles assumed by 
the discussion leader, the proportion of discussion 
time consumed by the leader under the two condi- 
tions was calculated for each of three equal divi- 
sions of the discussion period. From Table 1, it is 
clear that the leader used more discussion time when 
in a directive position, although the group members 
observed under this condition tended to consume 
more time after the first third of the discussion. 
Further analysis of the leader’s behavior showed 
that his greater dominance of the discussion in the 
directive role resulted from his making longer rather 
than more frequent comments. 

Inspection of the leader’s statements suggested that 
they might be categorized meaningfully to reflect the 
directive or nondirective roles and, hence, to further 
validate the effectiveness with which these two con- 
ditions were established. Table 2 contains the break- 
downs of the leader’s comments under the two ex- 
perimental conditions according to these categories. 

It is apparent from this table that in the directive 
role, the discussion leader expressed a much greater 
proportion of opinions or interpretive statements and 
related information somewhat more frequently than 
in the nondirective role. In the nondirective setting, 
the leader engaged in considerably more reflecting or 
rephrasing of group comments, agreed more with re- 
marks made by group members, and directed more 
questions at the group. The coding of leader com- 
ments according to the seven categories was done in- 
dependently by two judges, with a reliability coeffi- 
cient of .90. 

It may be concluded from the data in Tables 1 and
2 that the discussion leader did, in fact, adopt two
discriminable roles in handling the different groups,
and that our initial definitions of the directive and
nondirective approaches were effectively implemented
in the experimental situations.


Table 1

Proportion of Discussion Time Consumed by Leader

                      Time Periods
                                             Total
Groups*             I       II      III    Discussion

ND (N = …)          …       …       …        .10
ND (N = …)          …       …       …        .23
ND (N = …)          …       …       …        .18
ND                  …       …       …        .17
D (N = …)           …       …       …        .64
D (N = …)           …       …       …        .42
D (N = …)           …       …       …        .53
D                   …       …       …        .53

* The symbol ND means nondirective, while D indicates
directive leadership.


Table 2

Percentage of Leader Behavior Occurring in
Various Descriptive Categories

                               ND      D

Opinion or Interpretation      07      …
Reflecting and Rephrasing      21      …
Relating Information           12      …
Questions                      15      …
Agreements                     43      …
Disagreements                  00      …
Humor                          02      …


Discussants’ perceptions of the leader. In order to 
score responses to the adjective checklist, ordinal 
numbers were assigned to the 13 positions on each 
of the 20 scales. In every case, the value 13 indi- 
cated the more “favorable” end of the scale, while 
a score of 1 marked the less favorable extreme. The 
grand mean for each adjective scale was calculated 
and was used to dichotomize that variable for pur- 
poses of comparing score distributions under the two 
leadership conditions. In other words, a given S’s 
rating was classified as falling either above or below 
the mean rating of all Ss for that item. Ss who 
were above the grand item means on more than half 
of the adjective scales were classified as “more fa- 
vorable” in their over-all evaluation of the leader, 
while Ss scoring below the grand means on half or 
more than half of the scales were classified as “less 
favorable” in their evaluations. There was a uni- 
form tendency in all of the groups for the leader to 
be favorably rated, so that the differences indicated 
the relative extent to which the leader was favored 
by Ss under the two conditions. 
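
The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure as the authors carried it out: the function name `classify_raters` and the toy ratings are hypothetical, and a rating exactly equal to the grand item mean is counted as not above it, a case the text does not address.

```python
from statistics import mean

def classify_raters(ratings):
    """ratings: one list of scale scores (1 to 13) per S, same scale order for all.

    Returns "more favorable" or "less favorable" for each S.
    """
    n_scales = len(ratings[0])
    # Grand mean of each adjective scale across all Ss.
    grand_means = [mean(s[i] for s in ratings) for i in range(n_scales)]
    labels = []
    for s in ratings:
        # Count the scales on which this S rated above the grand item mean.
        above = sum(1 for i in range(n_scales) if s[i] > grand_means[i])
        # Above on more than half of the scales: "more favorable"; otherwise "less favorable".
        labels.append("more favorable" if above > n_scales / 2 else "less favorable")
    return labels

# Toy data: three Ss, four scales (hypothetical values, not from the study).
toy = [[13, 12, 11, 12], [7, 8, 9, 12], [10, 10, 10, 3]]
print(classify_raters(toy))
```

Because the cut is made at the grand mean of all Ss, the labels express relative rather than absolute favorability, which matches the remark that the leader was favorably rated in every group.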

When playing a directive role, the discussion leader 
received more favorable ratings by 22 group mem- 
bers and less favorable ratings by 13 participants. 
In the nondirective role, on the other hand, he was 
rated more favorably by 8 individuals and less fa- 
vorably by 22 persons. These differences when tested 
by chi square were significant at the .01 level. Ss 
experiencing directive discussion leadership, therefore, 
were significantly more favorable in their over-all 
evaluation of the discussion leader than were Ss un- 
der nondirective leadership. 
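
The chi-square test on these four frequencies can be checked with a short sketch. The text does not say whether a continuity correction was applied, so both versions are computed; the function names are hypothetical, and the p value uses the standard identity for one degree of freedom.

```python
import math

def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square (1 df) for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates continuity correction (an assumption; not stated in the text).
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

# Rows: directive, nondirective; columns: more favorable, less favorable.
chi2 = chi_square_2x2(22, 13, 8, 22)
chi2_yates = chi_square_2x2(22, 13, 8, 22, yates=True)
print(round(chi2, 2), round(chi2_yates, 2), p_value_1df(chi2_yates) < 0.01)
```

With or without the correction, the statistic exceeds 6.635, the tabled value for the .01 level with one degree of freedom, which agrees with the significance level reported above.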

Of the 20 adjective pairs, the directive leader re- 
ceived a more favorable rating in 13, while the non- 
directive leader received more complimentary ap- 
praisal in only 7. Contingency tables were con- 
structed for each of the adjective pairs, and the 
adjectives favorable to each of the two types of 
leadership are listed in Table 3 in order of the values 
of the chi squares that were obtained. Those adjec-
tives that discriminated between the two leadership
conditions at the .01 and .05 significance levels are
indicated by a double or single asterisk respectively.


Table 3

Descriptive Adjectives Ranked According to Their Discriminative Power

Directive Leadership                  Nondirective Leadership

**interesting (uninteresting)         *permissive (restrictive)
**frank (evasive)                     open-minded (opinionated)
**satisfying (disappointing)          reserved (forward)
*purposeful (aimless)                 cautious (rash)
*enlightening (unenlightening)        reasonable (stubborn)
*industrious (lazy)                   practical (idealistic)
*persuasive (unconvincing)            modest (arrogant)
penetrating (superficial)
helpful (hindering)
effective (ineffective)
friendly (unfriendly)
tactful (tactless)
considerate (selfish)


Perception of the leader and participation. Since
our record of sequence of comments by the discus-
sion group members enabled us to identify each in-
dividual with his remarks, we were able to dichoto-
mize the leader ratings by the extent to which the
raters had participated in discussion. When the Ss
under each leadership condition were classified ac-
cording to whether they were above or below the
median of those groups with respect to number of
comments (high and low participators, respectively)
and whether their ratings of the leader were above
or below the grand item means on over half of the
adjective scales, the contingencies shown in Table 4
resulted.

While the results shown in Table 4 are not signifi-
cant when the two experimental conditions are con-
sidered separately, analysis of ratings by the low
participators only showed them to be significantly
more satisfied with directive discussion leadership
than with nondirective leadership (p < .01). High
participators tended to rate the leader less favorably
under both leadership conditions.

Evaluation of the discussions. Responses to the 
14 items which followed the adjective checklist were 
uniformly favorable, and for only one item was 
there a significant difference between the directive 
and nondirective groups. When responses to the 
question, “To what extent was the leader responsible 
for important contributions to the discussion?” were 
dichotomized into “very much” and “somewhat or 
little,” the directively led discussions received a sig-
nificantly (p < .05) greater proportion of ratings in
the “very much” category than did the nondirectively
led groups.

This result provides further evidence that the group
members were aware of certain salient features of
both directive and nondirective leadership as these
roles are ordinarily defined. The remaining 13 items,
on which no significant differences were obtained,
dealt with such matters as the adequacy of the film,
the extent to which important issues were raised in
the discussion, how relaxed the S was during the dis-
cussion, and the satisfaction of the S with the dis-
cussion. It is perhaps worth noting that 6 of the 31
Ss under directive leadership expressed dissatisfaction
with the discussion, while none of the nondirective
group members indicated disapproval.


Table 4

Frequency of More Favorable and Less Favorable Evaluations of the Leader for High and Low
Participators Under Each Leadership Condition

                            Directive                  Nondirective

                        More        Less           More        Less
                      Favorable   Favorable      Favorable   Favorable

High Participators        …           …              …           …
Low Participators         …           …              …           …

Since the film dealt with a topic of common in- 
terest, and the experimenters were, in a sense, pro- 
viding an educational program for the participants 
in the experiment, it is not surprising that generally 
favorable reactions to the discussion and the discus- 
sion leader were obtained regardless of the permis- 
siveness of the leader’s role. 


Discussion 


The results of the experiment indicate that 
a directive approach by a discussion leader is 
favored by members of sophisticated adult 
discussion groups. While this preference is 
not revealed in judgments of the participants 
about the discussions, it is reflected in the de- 
gree to which they assign favorable ratings to 
the discussion leader. Examination of the 
group members’ opinions of the leader ac- 
cording to the extent of participation of each 
member in the discussion suggests an explana- 
tion for the over-all group judgment. The 
more favorable opinions of the directive dis- 
cussion leader appear to have come princi- 
pally from those Ss who were classified as 
“low” participators. Inspection of Table 4 
shows that the high participators were about 
equally divided in the extent to which they 
rendered favorable judgments of the leader, 
while the low participators account for the 
significantly more favorable evaluation of 
leadership elicited under directive conditions. 
The distinction between high and low par- 
ticipation in accounting for member satisfac- 
tions and the effects of leadership style has 
also proved to be important in a number of 
other studies (Deignan, 1956; Porter, 1955). 
One might conjecture that the less aggressive 
members of a discussion group are more de- 
pendent upon leadership and, therefore, prefer 
a leader who answers this need. The fact 
that even the more active participants were 
almost 2 to 1 in the direction of less favorable 
response to the leader’s playing a nondirec- 
tive role, however, confirms the finding that 
directive leadership is rated more favorably 
by discussion group members in general. 

We found no confirmation for previous re-
ports that the nondirective, or group-centered,
leader is likely to be more successful in lead-
ing his group, when success is defined in
terms of group reaction to the discussion. 
Our Ss were equally well satisfied with the 
discussions under both leadership conditions, 
but we are unable to say, of course, whether 
this would be true with different types of 
groups discussing different topics. There is 
also the possibility that the personal charac- 
teristics of the discussion leader contributed 
to the present over-all pattern of favorable 
response. In cases where the leader’s prefer- 
ences and skills favor either the directive or 
the nondirective role, different results might 
be expected. Since our data indicate that the 
leader played both roles effectively, this con- 
tingency probably does not apply to the pres- 
ent findings. 


Summary 


Three small groups of adult Ss viewed and 
discussed a motion picture film under direc- 
tive discussion leadership, while three addi- 
tional groups followed the same procedure 
under nondirective leadership. Following the 
discussion, the Ss rated the leader in terms 
of 20 adjective pairs, each of which defined 
favorable and unfavorable ends of a con- 
tinuum. They also answered questions rela- 
tive to the value of the discussion. 

In the groups in which he played a direc- 
tive role, the leader received significantly 
more favorable ratings than in those groups 
where he employed a nondirective approach. 
The directive leader was rated as significantly 
more interesting, frank, satisfying, purpose- 
ful, enlightening, industrious, and persuasive, 
and significantly less permissive, than the 
nondirective leader. 

When the group members were classified as 
high or low participators, it was found that 
the low participators were distinctly more fa- 
vorable to directive than to nondirective lead- 
ership. The high participators did not re- 
act in significantly different fashion to the 
two leadership conditions, although they also 
tended to be less favorably disposed to the 
nondirective leader. 

Judgments about the discussion itself were 
uniformly favorable in all of the groups and 
were not related to the type of leadership im-
posed.


Received September 24, 1958


A STUDY OF THE VALIDITY OF THE SALES
COMPREHENSION TEST AND SALES MOTIVATION
INVENTORY IN DIFFERENTIATING HIGH AND LOW
PRODUCTION IN LIFE INSURANCE SELLING


Predicting success in selling continues to be
a prime concern of many applied psycholo- 
gists in industry. Considerable time, effort, 
and money are spent by business and indus- 
try in an effort to improve procedures and 
techniques in selecting competent sales per- 
sonnel (Super, 1949). This is especially true 
in the life insurance sales field which has been 
supporting a group of psychologists to work 
on this and related problems. 

Husband (1949) provided an excellent re- 
view of the literature on the selection of sales 
personnel up to that date, and more recently 
Austin (1954) reviewed research pertaining to
sales personnel selection. Basically, published
research indicates that there is considerable 
room for improvement in tools used for sales 
personnel selection. 

A recent research report (Kennedy, 1958) 
tends to negate an idea developing for some
time: “General tests” of sales potential have
little or no value; specific tests for specific 
sales situations are required. The two instru- 
ments used in the present study are “general” 
rather than specific. 

In 1953 Bruce published two instruments 
designed to aid in selection of sales personnel,
the Sales Comprehension Test and the Sales 
Motivation Inventory. A validation article on 
the first instrument appeared in this journal 
the following year (Bruce, 1954a). 

The purpose of the present investigation is 
to determine if a relationship exists between 
these two instruments and achievement in life 
insurance selling. 

Procedure 
Tests 
The Sales Motivation Inventory (Bruce, 1954b) is
a 75-item multiple choice preference form. It cor-
relates relatively highly with the Sales keys of the
Strong Vocational Interest Blank for Men and with
the persuasive score of the Kuder Preference Record,
Vocational. Supplementary normative data have ap-
peared in the literature (Bruce: 1956e; 1956f; Mur-
ray & Bruce, 1957b), but not validation studies.

The Sales Comprehension Test (Bruce: 1953a; 
1957a) is a derivative of the test of Sales Aptitude 
(Principles of Selling) Form A (Bruce, 1958; Bruos, 
1953). It contains 30 multiple choice items consist- 
ing of concepts and sales situations. A number of 
research studies and supplementary data publications 
have appeared in the literature relative to the Sales 
Comprehension Test (Bass, 1957; Bruce: 1956a; 
1956b; 1956c; 1956d; 1956g; 1957b; Bruce & Frie- 
sen, 1956; Hecht & Bruce: 1957; 1958; Murray & 
Bruce, 1957a) and the precursor instrument (Bruce, 
1954b; Gray & Rosen, 1956; Harless & Bruce, 1957;
Speer, 1957). 


Subjects 


The Ss of the investigation were 60 ordinary life 
insurance salesmen chosen at random from among 
volunteers in companies licensed to operate in the 
state of Nebraska. This group was dichotomized at 
the $400,000 production mark into “successful” and 
“unsuccessful” groups, placing 39 men in the former 
group and 21 in the latter. These men represent 17 
life insurance companies. The territories of 52 of 
these men were in Omaha and Lincoln, while the re- 
mainder covered small towns. No industrial or debit 
life insurance salesmen were included in the sample.

All Ss worked full time in life insurance selling.
They had very limited or no supervisory responsi-
bility. Each man had at least one year of experi-
ence.

The range of insurance sold in the criterion year
was from $168,000 to $2,479,000. The mean age of
the “successful” group was 33.8, while the mean age
of the “unsuccessful” group was 4? years. Length
of life insurance sales experience averaged 7.3 years
for the former group, and 9.6 years for the latter.
Mean paid-for life insurance production for the
calendar year 1956 was $696,000 for the “successful”
group, and $306,000 for the “unsuccessful” group.


The Criterion 
The measure of performance used in this study
was the paid-for life insurance production for the
calendar year 1956. 


Results 


Group comparisons of Sales Motivation In- 
ventory scores, Sales Comprehension Test 
scores, and a combination of these two scores 
were made to determine if there were signifi- 
cant differences between the “successful” and 
“unsuccessful” life insurance salesmen.

The t technique was employed to determine
significance of difference between group mean
scores, utilizing where applicable a correction
formula that takes into account the hetero-
geneity of variance (Guilford, 1956). The
existence of common population variance was
tested by applying the F test.


Sales Motivation Inventory 


Scores for “successful” salesmen range from 
61 to 12 on the Sales Motivation Inventory, 
and for the “unsuccessful” group, the range 
on this instrument was 57 to —13. The 
means, respectively, were 35.54 and 25.20, 
and SDs were 11.70 and 18.10. 

The resulting F of 2.39 is .01 short of the
.02 level of confidence, and the t of 2.63 is
significant at the .05 level (2.07 required).
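
The F and t values can be approximated from the summary statistics given above. The sketch below rests on assumptions that the text does not confirm: that each SD was computed with N rather than N - 1 in the denominator, and that the t shown is the ordinary pooled-variance ratio, with any heterogeneity correction (the text cites only Guilford, 1956) affecting the required value rather than the ratio itself. The function names are hypothetical.

```python
import math

def f_ratio(sd_a, sd_b):
    """Ratio of the larger to the smaller variance."""
    big, small = max(sd_a, sd_b), min(sd_a, sd_b)
    return (big / small) ** 2

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """t for two independent means from summary statistics.

    Assumes each SD was computed with N (not N - 1) in the denominator,
    so the pooled variance is (n_a*sd_a^2 + n_b*sd_b^2) / (n_a + n_b - 2).
    """
    pooled_var = (n_a * sd_a ** 2 + n_b * sd_b ** 2) / (n_a + n_b - 2)
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se_diff

# Sales Motivation Inventory: "successful" (n = 39) vs. "unsuccessful" (n = 21).
print(round(f_ratio(11.70, 18.10), 2))
print(round(pooled_t(35.54, 11.70, 39, 25.20, 18.10, 21), 2))
```

Under these assumptions the sketch gives values close to the reported F of 2.39 and t of 2.63; a different convention for the SDs would shift the t slightly.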

These findings suggest that the Sales Moti- 
vation Inventory is capable of differentiating 
the more competent life insurance salesmen 
from the less competent life insurance sales- 
men in the geographical area covered, and un- 
der the circumstances of this study. 


Sales Comprehension Test 

On the Sales Comprehension Test, scores 
for the “successful” group ranged from 57 to 
6, while 52 and 7 were the upper and lower 
limits of scores for the “unsuccessful” sales- 
men. The mean and SD in the former group 
were 32.23 and 11.15, and in the latter group 
28.34 was the mean and 13.5 was the SD. 

The F here proved to be 1.47, short of sig-
nificance at the .05 level. The required t of
2.00 for significance at the .05 level was not
obtained. 

These findings suggest that, used alone, the
Sales Comprehension Test is not a valid dif-
ferentiator of competence in life insurance
selling.


Sales Motivation Inventory and Sales Com- 
prehension Test Combined 


Part of the initial plan of study was to 
combine the two predictors for purposes of 
further analysis. The significant difference 
shown by the Sales Motivation Inventory and 
the direction, though lack of significance, of 
the Sales Comprehension Test scores rein- 
forced the soundness of this approach with a 
suggestion of potential differentiation at a 
statistically significant level. 

Composite scores for each member of the 
total population were obtained by weighting 
the two tests equally. This was accomplished 
by weighting raw scores in inverse proportion 
to SDs of the respective tests. These com- 
posite scores ranged from 130 to 38 in the 
“successful” group, and 111 to 15 in the “un-
successful” group. Means for the “successful”
and the “unsuccessful” groups were, respectively,
83.70 and 67.60. The respective SDs were 22.16
and 20.17.

In this situation, a t of 2.66 is required for
significance at the .01 level. The obtained t
of 2.73 exceeds this, indicating that a difference
this large between the sample means would be
expected by chance less than one time in 100.


Summary and Conclusions 


Sixty ordinary life insurance agent volun- 
teers were obtained from 17 companies op- 
erating in Nebraska. All were experienced 
life insurance salesmen. The population was 
dichotomized unequally into “successful” and 
“unsuccessful” groups on the basis of insur- 
ance sold during the previous calendar year. 
All men completed the Sales Motivation In- 
ventory and Sales Comprehension Test. 

For both groups, t and F tests were com-
puted, and a t test was applied to a com-
bined test score.

Status validity in this situation was shown 
for the Sales Motivation Inventory to the ex-
tent of a t significant at the .02 level and an
F significant at approximately the .05 level.
The Sales Comprehension Test failed to show 
validity at accepted significance levels. 

Scores of the two instruments, when com-
bined, yielded a t significant at the .01 level.
The applicability of these findings to life
insurance selling in general is questionable in 
the absence of further research because of 
the restricted geographical distribution of the 
population employed. 

Adequacy of criteria is a traditional re- 
search problem, and one that merits ques- 
tioning in this study. Reliability might be 
obtained by correlating the criterion year’s 
production with previous or subsequent years. 
Positive relationship would enhance confi- 
dence in the results. 

Equivalence of territories appears to be a 
reasonable assumption, but is not known as a 
fact. Subsequent studies might well control 
this factor more adequately. There is also
the matter of the companies for which the men
sell. These may confer advantages or disad-
vantages worthy of control in order to better
equate populations studied. 

A single criterion was employed in this 
study. Greater validity and reliability are 
potentially available through multiple criteria 
when evaluating predictors for complex jobs 
such as that of salesmen. A larger N is cer- 
tainly to be sought in subsequent studies. 

The results of this study, in spite of limita- 
tions of design and content, suggest possible 
value for the Sales Motivation Inventory and 
Sales Comprehension Test in differentiating 
levels of success among life insurance agents. 
The method of test validation used
here merits checking in other geographical 
locations to help determine the status validity 
of the two tests, and the follow-up method is 
to be encouraged to aid in determining pre- 
dictive validity of the Sales Motivation In- 
ventory and Sales Comprehension Test in life 
insurance selling. At least one such study 
with the Sales Comprehension Test involving 
9000 job applicants is now in progress. 


Received September 25, 1958.


THE EFFECT OF A SUBLIMINAL FOOD STIMULUS
ON VERBAL RESPONSES ¹


DONN BYRNE ²


San Francisco State College 


Since the autumn of 1957, subliminal stimu- 
lation has been the object of considerable 
public attention. The furor was instigated 
by the widely publicized Vicary (Brooks, 
1958) study in which sales were increased 
by the presentation of the phrases “Eat Pop-
corn” and “Drink Coca-Cola” at 1/3000 of
a second on a movie screen before an unsus- 
pecting audience. Subsequent public fears 
and pronouncements have far out-distanced 
the empirical data. 


Problem 


The concept of subliminal perception is an 
old and familiar one to psychology. It has 
long been known that behavior may be af- 
fected by stimuli of which the individual is 
not verbally aware. Beginning in the nine- 
teenth century, laboratory evidence has ac- 
cumulated to reveal greater than chance ac- 
curacy in the discrimination of visual, audi- 
tory, and olfactory stimuli rendered subliminal 
by distance (Sidis, 1898; Stroh, Shaw, & 
Washburn, 1908), low intensity of the stimu- 
lus (Coyne, King, Zubin, & Landis, 1943; 
Hollingworth, 1913; Laird, 1932; Miller, 
1939; Williams, 1938), low intensity of sur- 
rounding illumination (Baker, 1937), high 
intensity of surrounding illumination (King, 
Landis, & Zubin, 1944), and lack of atten- 
tion (Collier, 1940). In addition, research 
dealing with perceptual defense and learning 
without awareness involves response to stimuli
lying below the Ss’ conscious thresholds. Lit-
erature in these research areas has been re-
viewed by Adams (1957), Lazarus and Mc-
Cleary (1951), and McConnell, Cutler, and
McNeil (1958).


1 Funds for this study were provided by the Insti-
tute for Communication Research of Stanford Uni-
versity.

2 The author wishes to express his thanks to W.
Schramm for his interest and support and to his
colleagues R. Haber, L. Petrinovich, H. Robinson,
M. Keston, and D. Shannon for their cooperation.
The work of the research assistants F. Burkart, W.
Hall, R. Kerr, F. Kopache, D. Minor, J. Murphy,
K. Neuberger, J. Palmer, S. Perry, B. Powell, N.
Skotdal, R. Starrett, T. Stephens, and E. Sweeney
is also gratefully acknowledged.

None of these laboratory experiments has 
given rise to public concern. However, many 
persons have recently become frightened by 
the idea that subliminal stimuli may be ac- 
tively manipulated in an effort to influence 
human behavior in some predetermined man- 
ner. Fears have centered in the possible un- 
scrupulous and unethical use of the tech- 
nique by advertisers and politicians. Visions 
of 1984 and Brave New World have been met 
by reassurances that no one can be made to 
do something personally unacceptable, that 
the stimulus can only act as a reminder for 
an existing desire, etc. What seems most 
clear is that extensive research is needed. 

The present investigation represents an at- 
tempt to isolate and study a few of the pos- 
sible variables. With a food word as the 
subliminal stimulus, four hypotheses were 
formulated: (a) verbal references to the 
stimulus word are increased; (b) in a choice
situation, the stimulus object is preferred; 
(c) subjectively perceived hunger is greater; 
and (d) each of these effects is greater under 
conditions of high physiological hunger drive. 


Method 


The Ss were 105 (45 male, 60 female) students in 
four required freshman psychology classes. The ex- 
perimental and control groups each consisted of one 
11:00 A.M. and one 1:00 P.M. class.

A 16-minute movie, Controlling Behavior Through
Reinforcement, was shown to each group. For the 
experimental Ss only, the word “beef” was super- 
imposed on the screen every seven seconds in flashes 
of 1/200 of a second duration.³


3 An Eastman-Signet Slide Projector (Model I,
500 watts, 35 mm.) was used, with a five-inch 3.5
Kodak Ektanon lens. A new automatic shutter sys-
tem was designed and constructed by James F. Lee,
94 Willow Rd., Menlo Park, California. The shutter
system, itself, consisted of a Compur Rapid /2.2
shutter with a maximum opening of 63/64 inches.
All of the apparatus was placed in a soundproof
projection box.
Immediately following the movie, two student Es
entered the room and asked permission to administer
a “Health Inventory.” The Ss rated their hunger on
a five-point scale (not at all hungry to very hun-
gry), responded to a brief sentence completion and
word-association test, checked their sandwich pref-
erence (tuna, hamburger, cheese, steak, or roast
beef), and indicated the time at which their last
meal was eaten. Identical items concerning fatigue
and smoking were included as buffers. Afterward,
the Ss were asked their reactions to the movie and
whether they noticed anything unusual about it.


Results 


Of the original 108 Ss who took part in the 
experiment, two saw the word “beer” and one 
saw “beef” flashed on the screen, so they 
were eliminated from the sample. The re- 
maining Ss gave no evidence of having per- 
ceived the stimulus. 

First hypothesis. The responses to the sen- 
tence completion and word association tests 
did not contain a sufficient number of beef or 
meat references to yield scorable categories. 
Subliminal stimulation did not increase ver- 
bal references to the stimulus word. 

Second hypothesis. A higher proportion of
the experimental than of the control group
(.37 and .28) chose roast beef in preference
to the other four sandwich types. However,
this difference was not statistically significant
(χ² = 1.10, P > .05). The difference almost
reached statistical significance with the fe-
male Ss (χ² = 3.70, P < .10) but not at all
for the male Ss (χ² = .02, P > .05). The sex
difference in choosing the roast beef sand-
wich in the control versus the experimental
groups was a significant one (χ² = 8.46, P <
.01). The subliminal presentation of the
word “beef” did not influence food prefer-
ences as measured by a paper and pencil
device.
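
The kind of comparison reported for the sandwich choices can be sketched as a chi-square test on a 2 x 2 table. The sketch below is illustrative only: the cell counts are hypothetical (the report gives proportions, not counts), the function name `chi_square_2x2` is an assumption, and no continuity correction is applied, which may differ from the computation actually used.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table
        [[a, b],
         [c, d]]
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts. Rows: experimental, control.
# Columns: chose roast beef, chose another sandwich.
chi2 = chi_square_2x2(21, 37, 13, 34)
CRITICAL_05 = 3.841  # chi-square critical value for 1 df at the .05 level
print(f"chi-square = {chi2:.2f}; exceeds .05 critical value: {chi2 > CRITICAL_05}")
```

With counts of this order the statistic stays well below the 1-df critical value, which is the pattern the text describes for the combined sample.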

Third hypothesis. The experimental Ss 
rated themselves hungrier than did the con- 
trol Ss to a statistically significant degree 
(F = 11.00, df = 1/101, P < .01) as tested 
by analysis of variance. An attempt was 
made to control the physiological hunger 
state by using each group at the same times 
of day. However, there were group differ- 
ences in hours of food deprivation, and this 
latter variable was also significantly related 
to the hunger ratings (r = .21, P < .02).
Table 1

Comparison of Experimental and Control Group
Differences on Hunger Self-Ratings and
on Hours of Food Deprivation

                        Hunger Scale Ratings
Group            N      SD
Experimental     58
Control          47


Therefore, the hunger ratings of the experi- 
mental and control groups were compared in 
another analysis of variance, with hours of 
food deprivation controlled by the covariance 
technique. Group differences were still sig-
nificant (F = 10.96, df = 1/102, P < .01).
No significant sex differences were found on 
this variable. The means and standard de- 
viations are shown in Table 1. Since the two 
groups did not differ significantly in their 
self-ratings of fatigue or desire for a cigarette, 
it was concluded that the subliminal food 
stimulus increased subjective hunger, as meas- 
ured by a self-rating hunger scale. 
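
As a rough illustration of the F test behind the hunger-rating comparison, the sketch below computes a single-factor analysis of variance for two groups. It is a simplified version under stated assumptions: the ratings are hypothetical five-point scores, the function name `one_way_f` is invented for the example, and neither the covariance adjustment for hours of deprivation nor any second factor in the reported analysis is reproduced.

```python
def one_way_f(*groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.

    Assumes at least two groups and some within-group variation.
    """
    scores = [x for g in groups for x in g]
    grand_mean = sum(scores) / len(scores)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: scores around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical hunger self-ratings (1 = not at all hungry, 5 = very hungry)
experimental = [3, 4, 2, 5, 3, 4]
control = [2, 3, 1, 3, 2, 2]
f_ratio, df1, df2 = one_way_f(experimental, control)
print(f"F = {f_ratio:.2f}, df = {df1}/{df2}")
```

The resulting F would then be compared with a tabled critical value for the stated degrees of freedom.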

Fourth hypothesis. Food deprivation of 
200 minutes was a convenient midpoint to 
divide Ss into high and low drive state groups. 
The experimental and control groups were 
compared on their sandwich preferences with 
the high and low drive subgroups analyzed 
separately. In neither high drive (χ² = .24,
P > .05) nor low drive (χ² = .16, P > .05)
conditions was there a significant difference 
in sandwich preferences. 


Table 2

Comparison of Experimental and Control Group Differ-
ences on Hunger Self-Ratings in High
and Low Drive Conditions

Group                         Hunger Scale Ratings
Experimental-High Drive
Experimental-Low Drive
Control-High Drive
Control-Low Drive

The differential effect of drive state on the
hunger ratings for the experimental and con- 
trol groups (data shown in Table 2) was 
tested by a two-way analysis of variance 
using the Walker and Lev approximation 
(1953, pp. 381-382). In addition to the 
group differences reported above, the high 
drive Ss rated themselves hungrier than the 
low drive Ss (F = 22.50, df = 1/101, P <
.001), but the interaction effects were not
significant (F = 1.00, df = 1/101, P > .05). 
Thus, high drive state was not found to be 
a necessary condition for influence by a sub- 
liminal stimulus. 


Discussion 


In general, the results of this investigation 
suggest that it is possible to affect some as- 
pects of human behavior through the use of 
stimuli presented in exposures too rapid for 
conscious perception. The findings also sug- 
gest, not too surprisingly, that any com- 
mercial application of subliminal stimulation 
must take more variables into account than 
simply the chosen stimulus and the desired 
response. 

Even though the projective measures of 
response did not reflect any stimulus influ- 
ence, it is possible that more extensive and 
varied projective devices would have yielded 
the expected results. 

The failure of the experimental Ss to
choose the roast beef sandwich significantly 
more than the control Ss suggests the fol- 
lowing explanation. In asking for sandwich 
preferences, one is dealing with long estab- 
lished habit patterns which have been rein- 
forced on countless occasions. It might be 
hypothesized, though, that subliminal stimu- 
lation could determine preferences between 
equally familiar and desirable alternatives 
(e.g., two previously unknown political can-
didates) or between items with minimal
differentiating qualities (e.g., two popular
brands of toothpaste). The finding of sig- 
nificant sex differences in susceptibility to 
subliminal influence should be explored fur- 
ther. 

Since the subliminal food stimulus did in- 
crease subjectively felt hunger, it seems rea- 
sonable to hypothesize that appropriate stimu-
lation could arouse thirst, fear, hate, anxiety,
sexual desire, etc. Should drive arousal prove 
to be the major effect of subliminal stimula- 
tion, it is possible that Vicary’s findings 
(Brooks, 1958) resulted from the evoking of 
hunger and thirst drives; popcorn and Coca- 
Cola were bought simply because they were 
available in the lobby. It would be interest- 
ing to learn whether the sale of other soft 
drinks and of candy bars also increased. 

Finally, the idea that a drive must be pres- 
ent for subliminal stimulation to be effective 
was not supported. Rather, it may be pos- 
sible to use this method to create a need 
where it is absent. 


Summary 


This experiment was undertaken in order 
to test four hypotheses involving the effect of 
subliminal stimulation on human behavior.
The experimental group saw a classroom 
movie with the word “beef” superimposed in 
flashes of 1/200 of a second every seven 
seconds; the control group just saw the 
movie. It was found that, compared to the 
control Ss, the experimental Ss (a) did not 
show increased verbal references to the stimu-
lus word; (b) did not choose the stimulus ob-
ject in a multiple choice situation (though 
sex differences were significant); but (c) did 
rate themselves significantly more hungry. It 
was also found that hours of food deprivation 
did not influence any of these relationships. 
PSYCHOLOGICAL ADJUSTMENT AND THE WORKER
ROLE: AN ANALYSIS OF OCCUPATIONAL DIFFERENCES ¹


LAWRENCE G. COREY 


Industrial Relations Center, University of Chicago 


In recent years, the role of worker has taken 
on added significance, especially in the area 
of aging and retirement. Because, in the 
past, it occupied by necessity so much of his 
life, the industrial employee’s worker role 
gradually became second in importance only 
to his religion. As a consequence, during 
the early phases of our country’s expanding 
economy, a developing labor force felt some
deep identification not only with their occupa- 
tion but with their nation’s pioneering spirit. 
But the vigorous industrial pioneer of yester- 
day is the older worker of today, and the 
psychological support he has come to expect 
and need from his work-role is somehow 
threatened by our recent trend toward auto- 
mation. This automation has subsequently 
resulted in a depersonalization of the older 
worker's previously paternalistic industrial 
setting: a depersonalization, which has fos- 
tered structural changes in the industrial en- 
vironment to which the older worker has not 
become reconciled. 

Now the problem of whether psychological 
adjustment is equally dependent upon the 
worker role for two occupational groups, such 
as nonmanual and manual employees, would 
seem to be of some importance. In fact, 
Burgess and his associates claim that the 
nonmanual employee is better prepared for 
aging and retirement, in general, than his oc- 
cupational counterpart, the manual laborer 
(Burgess, Corey, Pineo, & Thornbury, 1958). 
Therefore this paper will attempt to clarify 
the following question: Is the personal and 
social adjustment of managers, supervisors, 
professional-technical and clerical-sales per-
sonnel (nonmanual employees) equally de-
pendent upon their role of worker, as the per- 


1 The author is particularly indebted to Ernest W. 
Burgess for his kind counsel and helpful criticism of 
the manuscript, and also to his colleague, Peter C. 
Pineo, who helped to collect and code the data. 
sonal and social adjustment of skilled, semi-
skilled and unskilled personnel (manual la-
borers)?

However, before approaching an investiga- 
tion of the preceding question, it is neces- 
sary to arrive at a suitable definition of the 
worker role. In a general statement of the- 
ory, Sarbin (1956) defines the concept of 
“Role” as “. . . a patterned sequence of
learned actions or deeds performed by a per-
son in an interaction situation.”

Furthermore, Havighurst (1957), discussing
the implications of competent role perform-
ance to personal and social adjustment, writes:


Consequently, since a social role is both a social 
expectation and a self-expectation, performance in 
the common social roles should be related both to
. . . social adjustment in the sense of a smooth ad-
justment to one’s society, and to personal adjust-
ment in the sense of satisfactory feelings about one-
self.


For the purpose of this analysis, then, it 
will be sufficient to define the worker role in 
the following terms: The work-role is a pat- 
terned sequence of actions or deeds, learned 
for the purpose of utilitarian productivity, 
performed by a person in an industrial situa- 
tion and leading, through their competent per- 
formance, to a satisfactory manipulation of 
that person’s occupational skills, to a social 
adjustment with his fellow workers and to a 
personal self-esteem. 

Competency in the work-role therefore im- 
plies three conditions: first, that the person 
skillfully manipulates the tools of his occu- 
pation; secondly, that he “enjoys” using these 
tools; and thirdly, that he employs these 
tools with a certain ‘flair,’ savoir-faire, or 
inventiveness. Although the measure of 
worker role used in this paper does not spe- 
cifically refer to these three conditions, it was 
designed with them in mind and consequently 
implies a qualitative relationship to them.





Methods 


301 Ss between the ages of 55 and 65, all em-
ployees of a Midwestern oil refinery, were divided
into the classifications nonmanual and manual work-
ers. The former classification included 116 of the
company’s managerial, supervisory, professional-
technical, and clerical-sales personnel, while the lat-
ter classification contained 185 of the company’s
skilled, semiskilled and unskilled laborers.

The Ss participated in a survey made by the In-
dustrial Relations Center of the University of Chi-
cago, at which time they anonymously answered
either Yes, No, or Undecided to the 100 statements
in the Retirement Planning Inventory prepared by
Burgess and Mack and published by the Industrial
Relations Center.

Eighty-three of the 100 statements in the Retire-
ment Planning Inventory were written by the au-
thors; the remaining 17 items were taken from other
standardized sources. These 17 items constitute two
of the four measures reported in this paper, “Per-
sonal Adjustment” and “Job Satisfaction.” The 10
items in the Inventory which presumably measure
personal adjustment were derived from an item
analysis of the Cavan and Burgess “Study of the
Personal Adjustment of Old People.” Job satisfac-
tion is measured in the Retirement Planning Inven-
tory by the seven items in the SRA Employee In-
ventory found to constitute the factorial trait of job
satisfaction. The other two diagnostic measures in
this paper, “Social Adjustment” and the “Worker
Role,” were developed from the 80 items written
by Burgess and Mack. Social adjustment was gener-
ated from an item analysis (phi) of data collected
on over 1000 older people.

The measure of “Worker Role” requires some ex-
planation. Five judges (hopefully a cross section in
miniature of society) were asked to go over the 100
statements in the Inventory and select those which
seemed to touch on various aspects of a person’s
role as worker. The judges were told to rely mainly
on their own discretion in selecting items. In order
to standardize the procedure, however, the judges
were given a set of instructions to use as a basis for
their selection. In order for a question to qualify
as a measure of the worker role it had to meet the
following basic requirements: “Question applies to
the respondent’s present job only as it relates to his
present economic, emotional and old age role. It
does not involve any relationships with his family,
friends or co-workers. It does not apply to his fu-
ture plans for an occupation.” Questions which im-
plied any family, friends, or co-worker relationships
were purposely excluded from consideration in order
to keep the measure of worker role as free from
other variables as possible. A binomial expansion
was then calculated to determine a point of agree-
ment between the five judges that would meet or
exceed the .01 level of confidence. Those items
which then met or exceeded the .01 level of agree-
ment between the five judges were accepted as in-
dices of the worker role. Five questions were sub-


Table 1

Correlation Analysis Between Three Psychological
Factors and the Worker Role, by
Occupation Status
(Adapted from Corey a)

                              Worker Role
                         Nonmanual    Manual
Personal Adjustment         .08        .24*
Social Adjustment           .19*       .34*
Job Satisfaction            .25*       .26*

* Correlation significant at and beyond the .05 level of con-
fidence.
a Previously unpublished data of a larger project to deter-
mine psychological factors related to various old-age roles.


sequently selected by this method and constitute the 
measure of worker role used in this paper. These 
five questions and their favorable responses (prede- 
termined by Burgess and Mack) are available upon 
request from the author. 
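
The binomial reasoning behind the judges' agreement criterion can be sketched as follows. The report does not give the chance probability that a single judge selects a given item, so the value `P_SELECT = 0.20` below is purely hypothetical, as are the function names; the sketch only shows how a minimum number of agreeing judges could be read off a binomial expansion at the .01 level.

```python
from math import comb


def tail_probability(k, n, p):
    """Chance probability that at least k of n independent judges select an item."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def min_agreement(n, p, alpha):
    """Smallest number of agreeing judges whose chance probability is <= alpha.

    Returns None when even unanimous agreement is not rare enough.
    """
    for k in range(n + 1):
        if tail_probability(k, n, p) <= alpha:
            return k
    return None


P_SELECT = 0.20  # hypothetical chance that one judge picks any given item
needed = min_agreement(n=5, p=P_SELECT, alpha=0.01)
print(f"judges who must agree: {needed}")
```

Under this assumed selection rate, agreement among four of the five judges would be required. With a rate as high as .50, even unanimous agreement would fall short of the .01 level, so the criterion depends heavily on how selective the judges were.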


Results 


Having theoretically and operationally de- 
fined the role of worker, we can now proceed to 
a discussion of our original question. Table 1, 
adapted from Corey (1957), presents a prod- 
uct-moment correlation analysis in which per- 
sonal adjustment, social adjustment and job 
satisfaction are analyzed for their relation- 
ship to the work-role. Note that in Table 1, 
the nonmanual and manual employees are 
treated as separate populations in order that 
comparisons can be made between the two. 
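
The significance of the Table 1 correlations can be checked approximately with the usual t test for a product-moment r. The sketch assumes that the table entries are two-place decimals (e.g., 08 read as .08), that every subject in a group contributed to each correlation (n = 116 nonmanual, n = 185 manual), and that a two-tailed critical t of about 1.98 is adequate for degrees of freedom of this size; the function name `t_for_r` is invented for the example.

```python
from math import sqrt


def t_for_r(r, n):
    """t statistic (n - 2 df) for testing a product-moment r against zero."""
    return r * sqrt((n - 2) / (1 - r * r))


CRITICAL_T = 1.98  # approximate two-tailed .05 value for df above 100

table = {
    "Personal Adjustment": {"nonmanual": 0.08, "manual": 0.24},
    "Social Adjustment": {"nonmanual": 0.19, "manual": 0.34},
    "Job Satisfaction": {"nonmanual": 0.25, "manual": 0.26},
}
group_n = {"nonmanual": 116, "manual": 185}

for measure, row in table.items():
    for group, r in row.items():
        t = t_for_r(r, group_n[group])
        flag = "significant" if abs(t) > CRITICAL_T else "not significant"
        print(f"{measure:20s} {group:10s} r = {r:.2f}  t = {t:.2f}  {flag}")
```

Under these assumptions only the nonmanual correlation with personal adjustment falls short of the critical value, in line with the asterisks in the table.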

The relationships in Table 1 can be gener- 
alized as two statements: 

S1: While the personal adjustment of non- 
manual employees is not related to their ac- 
tivity in the role of worker, the maximum 
degree of positive personal adjustment for 
manual employees varies directly as a func- 
tion of increasing competency in their per- 
formance of the work-role. 

S2: The maximum social adjustment and 
job satisfaction of both nonmanual and manual 
employees varies directly as a function of in- 
creasing competency in their performance of 
the worker role. 

The single difference, therefore, between 
the nonmanual and manual employee is the 
degree to which personal adjustment is re- 
lated to the competent performance of their 





Occupational Differences 


respective work-roles. Since, however, per- 
sonal adjustment has already been defined as 
“satisfactory feelings about oneself,” this es- 
sential difference is a crucial one. 


Discussion 


It can be concluded from this analysis that 
the worker role, as a source of psychological 
support for the older employee, is directly as- 
sociated with the personal adjustment of the 
manual laborer, while there is little or no 
association between personal adjustment and 
the work-role for the nonmanual employee. 
Furthermore, it can be inferred, as a result, 
that increasing depersonalization of the in- 
dustrial setting has fostered irreconcilable 
conflicts not for the nonmanual employee, but 
rather, for the manual laborer, whose field of 
self-expression, other than his job, is neither 
as diversified nor psychologically rewarding 
as that of his occupational counterpart. 

On the other hand, the relationships in S2 
probably reflect the high premium which an 
individual’s society, regardless of his socio- 
economic status, places on the worker role. 
In that event, broadly defining “Social Adjust-
ment” as an individual’s group participation 
versus his social isolation, it can be assumed 
that the worker role, among other things, is 
significantly related to both the nonmanual 
and manual employee’s social adaptation. 

This association, however, undoubtedly ex- 
ists quite apart from the industrial deperson- 
alization which we previously discussed. In 
other words, it is not inconceivable that an 
individual’s job may qualify him for social 
acceptance as a productive and economically 
active member of his society, but it may not 
provide him with an immediate personal satis- 
faction. As we have already mentioned, the 
extent to which such a personal satisfaction 
or adjustment proceeds from the worker role 
depends in a large part on the individual’s 
occupational status and his work-role com- 
petency. 


Summary 


301 Ss, all between the ages of 55 and 65, 
were divided into two occupational statuses, 
nonmanual and manual workers. The former 
status included 116 of the managerial, su- 
pervisory, professional-technical, and clerical- 
sales personnel of a Midwestern oil refinery, 
while the latter status contained 185 of that 
company’s skilled, semiskilled, and unskilled 
laborers. Both groups were then treated as 
separate populations in an analysis of the 
worker role as it related to personal adjust- 
ment, social adjustment, and job satisfaction. 
It was found in the population studied that
the personal adjustment of nonmanual em-
ployees was not significantly related to their
work-role competency, while the personal ad-
justment of manual employees showed a sig-
nificant correlation with the worker role vari-
able. Both social adjustment and job satis-
faction were significantly related to the
worker role regardless of occupational status.
It was therefore concluded that the degree to
which personal adjustment is related to the
worker role depends to some extent upon an
employee’s occupational status.
FACTORS IN SUPERVISORS’ PERCEPTIONS OF
PHYSICAL SCIENCE RESEARCH PERSONNEL ¹


ROBERT E. STOLTZ


Southern Methodist University 


A commonly used criterion of productivity 
in areas where there appears to be a lack of 
more objective criteria is the rating of a
person’s work performance by a superior. 
Many studies have pointed out the factorial 
complexity of these ratings in widely diverse 
work areas. To date little has been done to
investigate this problem in the area of physi- 
cal science research work. The present study 
is an initial exploratory attempt to determine 
whether or not such complexities exist when 
supervisors are asked to describe the behavior 
of research workers in physical science fields 
and to tentatively identify any such factors 
that might be found. 


Method 


Intensive interviews were held with each of 27 re- 
search supervisors in a large, Midwestern research 
organization. This organization is devoted almost 
entirely to research in problems of the physical sci- 
ences and engineering. The organization is divided 
into divisions, each of which conducts research in a 
specific subject matter field. The persons interviewed 
were all heads or assistant heads of these divisions. 
The interviewed persons were asked to describe in 
detail the research behavior of the most productive, 
least productive, and most creative man in their di-
vision. The interviews might be described as some- 
what nondirective and were based on a modification 
of the Flanagan Critical Incident technique. The 
divisions covered by the interviews were engaged in 
research in such diverse areas as ceramic chemistry, 
reactor metallurgy, mechanical engineering, chemical 
engineering, and nonferrous metallurgy. 

From the interviews, a pool of over 500 state- 
ments, or items, was extracted on an a priori basis 
by the investigator. With the aid of eight psycholo- 
gists 225 of these items were grouped into 15 clusters 
of 15 items each. These items, with 25 additional 
items which were felt by the judges to be of interest, 
but which could not be categorized, were assembled 
into a checklist. This checklist was termed the Pro- 
ductive Behavior Checklist (PBC). Each page of 
the checklist contained 30 items and the pages of 


1 The author would like to thank the Battelle Me-
morial Institute for their cooperation and assistance 
in conducting this study, and C. L. Shartle, H. B. 
Pepinsky, and R. J. Wherry for their advice and 
technical assistance. 


the booklets were assembled in such a way that no 
two checklists contained the pages in the same order. 
This was done in order to avoid any consistent tend- 
ency of the raters to evaluate items according to 
their serial position of presentation. 
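
The page-order precaution can be sketched as drawing a distinct random permutation of the pages for every booklet. The figures used are inferences rather than reported values: nine pages (250 items at 30 per page would leave a partly filled ninth page) and 80 booklets (two copies for each of 40 supervisors). The function name `unique_page_orders` and the use of a seeded random generator are likewise assumptions.

```python
import random
from math import factorial


def unique_page_orders(n_pages, n_booklets, seed=None):
    """Return n_booklets page orders, no two of which are identical."""
    if n_booklets > factorial(n_pages):
        raise ValueError("more booklets requested than distinct page orders exist")
    rng = random.Random(seed)
    pages = list(range(1, n_pages + 1))
    seen = set()
    orders = []
    while len(orders) < n_booklets:
        order = tuple(rng.sample(pages, n_pages))  # one random permutation
        if order not in seen:  # reject an order already assigned to a booklet
            seen.add(order)
            orders.append(list(order))
    return orders


booklets = unique_page_orders(n_pages=9, n_booklets=80, seed=0)
print(booklets[0], booklets[1])
```

Because nine pages allow 362,880 orders, duplicates are rare and the rejection step seldom has to redraw.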

Forty heads and assistant heads of research divi- 
sions, including 13 of the original interviewees, were 
given two copies of the PBC and instructed to de- 
scribe the research behavior of two persons, the most 
productive and the least productive man in their di- 
vision. Twenty of the supervisors were told to rate 
the most productive man first and 20 were told to 
rate the least productive man first. Each item was 
to be rated on a five-point scale according to how 
well the item described the man in question. A rat- 
ing of five indicated an item that was very descrip- 
tive of the man being rated. A score for each clus- 
ter was obtained by summing the ratings for the 
items grouped within that cluster. 

Product-moment intercorrelations were computed 
between each item and each of the cluster scores. 
The entire set of 250 items was factor analyzed by 
the Wherry-Winer technique (1953) and the factors 
were rotated to orthogonality and simple structure. 
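
The scoring and correlation steps that precede the factor analysis can be illustrated briefly. The ratings, item labels, and function names below are hypothetical, and the sketch stops at the item-cluster correlation; it does not attempt the Wherry-Winer procedure or the rotation.

```python
from math import sqrt


def cluster_score(ratings, cluster_items):
    """Sum one ratee's five-point ratings over the items in a cluster."""
    return sum(ratings[item] for item in cluster_items)


def pearson_r(x, y):
    """Product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical ratings (1-5) of four ratees on a three-item cluster
CLUSTER = ["has_ideas", "has_imagination", "new_ways"]
ratees = [
    {"has_ideas": 5, "has_imagination": 4, "new_ways": 5},
    {"has_ideas": 2, "has_imagination": 1, "new_ways": 2},
    {"has_ideas": 4, "has_imagination": 4, "new_ways": 3},
    {"has_ideas": 1, "has_imagination": 2, "new_ways": 1},
]
scores = [cluster_score(r, CLUSTER) for r in ratees]
item = [r["has_ideas"] for r in ratees]
print(scores, round(pearson_r(item, scores), 2))
```

Note that an item correlated with a cluster score that includes the item itself will be somewhat inflated; whether the original analysis corrected for this is not stated.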


Results 

Five significant factors were obtained. Since 
it is impractical in the present space to give 
all of the items and their complete factor load- 
ings, only a sample of those items that aided 
in identifying the factors will be presented 
here.² Table 1 shows this sample and the
loadings of each item on the five factors. 

The most difficult factor to interpret is 
Factor I. While precautions were taken to 
avoid as much as possible the difficulty of 
encountering halo effect, it would be unre- 
alistic to assume that it was evaded com- 
pletely. Consequently, Factor I, which shows 
high positive loadings for the bulk of the 
items, might be best regarded as a general 

2 A table giving a complete list of the items from
the checklist and their final rotated factor loadings 
has been deposited with the American Documenta- 
tion Institute. Order Document No. 5885 from the 
ADI Auxiliary Publications Project, Photoduplica- 
tion Service, Library of Congress, Washington 25, 
D. C., remitting in advance $1.25 for 35-mm. micro- 
film or $1.25 for 6 X 8 in. photocopies. Make checks 
payable to: Chief, Photoduplication Service, Library 
of Congress. 
Table 1

Factor Loadings of Selected Sample of Items From PBC Grouped by Major Loadings

Item

Can organize the work of others
Can tell good from bad ideas
Good analytical ability
Does things on his own
Makes good use of time
Is realistic in estimating a situation
Comes up with new ways of doing things
Reports do not need to be rewritten
Does not offend others
Is not irritable
Makes friends readily
Has no difficulty working with others
At times may upset others
Wants recognition
Has a need to be recognized
Is not impatient with others
Will break his back to produce
Willing to put in extra time
Worries about how the job is going
Has real interest in job
Takes part in many social activities
Has a lot of outside interests
Does not fly off the handle
Writes so anyone can understand it
Is not hard for him to write reports
Seldom need to make changes in his reports
Can express himself in everyday terms
Can make things clear to others
Goes out of his way to help people
Is liked by others
Is not afraid to go to experts for help
Has lots of ideas
Has imagination
Has developed creative ability
Thinks there must be a better way of doing things
Can evaluate alternative approaches to the problem
Extremely neat and orderly
Likes routine work
Can’t stand to be unsuccessful

Note. All decimal points have been omitted.

factor indicating the extraction of halo ef-
fects. However, a number of the items do
not show high positive loadings on this fac-
tor, but do appear as the major contributors
to Factor II. Since the items were initially
extracted from the interviews because of their
apparent relevance to the productive process
and since 95% of the items were worded fa-
vorably, it might be expected that Factor I
would represent a general validity factor in-
dicating the extent to which each item is
highly evaluated in describing a productive
person. Within the framework of the design
used in this study it would be best to inter-
pret this factor as describing General Produc-
tive Behavior and in a sense confounding halo
effect and a crude validity index. The items
which showed the largest loadings on this fac- 
tor dealt with analytical thinking, technical 
knowledge and skill, work-oriented organiza- 
tional ability, willingness to assume responsi- 
bility and take independent action, and tech- 
nical report writing skill. Items showing low 
or negative loadings on this factor dealt with
interest in social activities, liking for routine 
or administrative work, agreeableness and 
pleasant personality features, and feelings of 
personal inadequacy. 

Factor II has been tentatively named Affa- 
bility. It appears to contain items dealing 
with behaviors that tend to make one liked 
by others. Persons rated high on this factor
would seem to be agreeable, pleasant and
good group members. It is of interest to
note that the items which load on this factor
that deal with freedom from aggressive acts
or sensitivity to others’ feelings generally show
low or negative loadings on the General Pro- 
ductivity factor. While not investigated in 
this study a suggested hypothesis for future 
study might be that the productive research 
worker is more prone to aggressive attacks on 
others in his work group than the nonproduc- 
tive researcher. In the light of the high 
value given behaviors reflecting strong moti- 
vation on Factor I, we might expect these ex- 
pressions of aggression to be specific to situa- 
tions in which the expression of productivity 
is blocked by some person or thing and to 
be more common among the more highly 
motivated productive researcher. This inter- 
pretation, or hypothesis, would appear to be 
consistent with current frustration-aggression 
theory. 

Factor III has been tentatively named Mo- 
tivation. The items receiving the largest load- 
ings on this factor deal predominantly with 
industriousness, willingness to exert effort, and 
interest in the job. The major negative load- 
ings on this factor come from items dealing 
with patience, calmness, and control of tem- 
per. The supervisor apparently sees the highly 
motivated researcher as one who is more prone 
to some expression of anger. Also negatively 
related to this factor were items dealing with 
the social activities of the Ss both within and 
without the company framework. The items
loading on this factor tend to show high posi- 
tive loadings on the General Productivity 
factor. 

Factor IV reflects the relevance of com- 
munication to the supervisors. The items 
showing the highest loadings on this factor 
are primarily concerned with the ability of 
the person to write effectively and to com- 
municate his ideas clearly to others. This 
factor has accordingly been named Ability to 
Communicate. The existence of a possible 
stereotype among the supervisors is indicated 
by the fact that persons rated high on this 
factor might be expected to be rated low on 
their willingness to help others.

As might be expected in a work area such 
as this, a factor describing creative activity 
was obtained. Factor V is made up princi- 
pally of items dealing with versatility, imagi- 
nativeness, and ingenuity. This factor has 
been named Creative Ability. Again we find 
what might be a supervisory stereotype in 
that items dealing with liking for routine 
work, neatness, orderliness, and methodical- 
ness are negatively related to this factor. 


Summary and Conclusions 


Forty physical science research supervisors 
described the behavior of productive and 
nonproductive research personnel using a 
250-item checklist derived from interviews with 
research supervisors. A factor analysis of the 
items comprising the checklist resulted in 
finding five significant factors. These factors 
have been tentatively named General Produc- 
tivity, Affability, Motivation, Ability to Com- 
municate, and Creative Ability. 

These dimensions will be useful in develop- 
ing rating scales to more adequately assess 
the research behavior of persons in this insti- 
tution and perhaps in others. The analysis 
also provides material for several hypotheses 
which might be investigated in other studies. 


EVIDENCE OF A PRACTICE EFFECT ON THE 
MILLER ANALOGIES TEST 


CHARLES D. SPIELBERGER 


The Miller Analogies Test (MAT) is currently 
widely employed in the selection of 
psychology graduate students. Nearly half 
of the institutions which offer advanced de- 
grees in psychology and almost two thirds of 
those which grant the Ph.D. require or recom- 
mend that the MAT be submitted with appli- 
cations for graduate training (Moore, 1957). 
The demonstrated validity of the MAT in 
the prediction of academic success (Cureton, 
Cureton, & Bishop, 1949; Fahey, 1953; Jensen, 
1953; Kelly & Fiske, 1951; Miller, 
1952) suggests that individuals performing 
poorly on this test are likely to be viewed as 
unfavorable scholastic risks. 

The MAT is required of all applicants to 
graduate programs in psychology at Duke 
University. In a number of instances where 
applicants had reported scores on two of the 
alternate forms of the test, a marked im- 
provement from the first to the second ad- 
ministration of the MAT was frequently 
noted. Since the MAT is in wide general 
use and since its employment often involves 
cutoff scores below which an applicant’s op- 
portunity for graduate training may be seri- 
ously prejudiced, the following studies were 
designed to test the validity of the observa- 
tion that scores on the MAT improve with 
practice. 

Experiment I 

Method. Form H of the MAT was group administered 
to 20 first-year psychology graduate students 
at Duke University in the fall of 1956. The 
test was given under standard conditions¹ except 
that the Ss were specifically told that the results 
would be used only for research purposes. All of 
the Ss had taken Form J of the test prior to their 
acceptance for graduate training. 


1 The author is indebted to Robert Colver of the 
Duke University Bureau of Testing and Guidance, 
a licensed MAT testing center, for his supervision of 
the administration of the MAT and his suggestions 
regarding research design. The cooperation and 
interest of Harold Seashore and the Psychological 
Corporation in making the MAT available for this 
research is gratefully acknowledged. 


Results. The means² and SDs for Forms 
H and J and the correlation between these 
forms³ are presented in Table 1, where they 
are compared with similar data on 135 graduate 
students reported in the MAT Manual 
(Miller, 1952, p. 6). The difference between 
the means of Form H and Form J was found 
to be highly significant (t = 4.01; p < .001). 
Seventeen of the twenty Ss improved their 
scores on taking the test a second time. This 
finding suggested two possible interpretations: 
(a) Form H was easier than Form J, or (b) 
there was a practice effect in taking the MAT 
which resulted in improved performance after 
an initial experience with the test. 
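
Because each S contributes both a first and a second score, a t test for correlated (paired) measures is one plausible reading of the statistic reported above; the paper does not say which formula was used. The following is a minimal sketch under that assumption. The scores are invented for illustration (they are not the Duke data) and the helper name `paired_t` is hypothetical.

```python
import math
import statistics


def paired_t(first_scores, second_scores):
    """Return (t, df, n_improved) for paired first/second test scores."""
    if len(first_scores) != len(second_scores):
        raise ValueError("each S needs a first and a second score")
    # Improvement score for each S: second score minus first score.
    diffs = [b - a for a, b in zip(first_scores, second_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample SD uses n - 1).
    se = statistics.stdev(diffs) / math.sqrt(n)
    n_improved = sum(1 for d in diffs if d > 0)
    return mean_diff / se, n - 1, n_improved


# Hypothetical scores for five Ss, for illustration only.
form_j = [70, 72, 75, 68, 80]
form_h = [74, 75, 79, 70, 85]
t, df, n_improved = paired_t(form_j, form_h)
print(f"t({df}) = {t:.2f}; {n_improved} of {len(form_j)} Ss improved")
```

A significant t of this kind shows only that second scores exceed first scores; it cannot by itself separate a practice effect from a difference in form difficulty, which is the ambiguity the two interpretations above describe.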

Individual interviews with 18 of the Ss 
tended to support the latter hypothesis, and 
suggested that the improvement was due to 
greater familiarity with the nature of the test. 
Two thirds of these Ss reported that they believed 
they had done better on the second 
test. Although the improvement was attributed 
to many factors, the Ss stated most 
frequently that they “knew what to expect” 
the second time they took the test. A number 
of Ss also reported that they were less 
anxious or nervous on the second test and 
several specifically related their decreased 
anxiety to greater familiarity with the test. 
Other reasons given to account for felt improvement 
by at least two Ss were: did not 
finish the first time; second form of the test 

2 A correction is required in order to make scores 
on Forms J and H equivalent to scores on Form G. 
This correction, which consists of adding two points 
to Form H and Form J scores in the 30 to 70 raw 
score range, was made wherever required in the 
present study. 

3 Although the coefficient of equivalence in the 
present study was substantially lower than those 
reported in the MAT Manual, a comparison of the 
SDs in Table 1 suggested that this could be 
attributed to the restricted range of talent consequent 
to the employment of the MAT as one of the 
criteria in the initial selection of these Ss. When 
recalculated with appropriate corrections for the 
restricted range, the obtained reliability coefficient was 
consistent with those reported between alternate 
forms in the MAT Manual (Miller, 1952). 


Table 1 

Means and SDs for Two Alternate Forms of the Miller Analogies Test and 
Correlations Between These Forms 

                                        Form J           Form H 
Sample                          N     Mean    SD      Mean     SD      r 
Duke Students                   20ᵃ    --     6.5     78.00    7.0     -- 
Normative Ss (Miller, 1952)    135     --    15.9     57.3    16.0     -- 

a In addition to Form J, one S had previously taken Form G and another Form H prior to the group-administration of Form 
H on which part of the data of Table 1 is based. 
The difference remained significant at the .001 level. 

was easier; pondered less over ambiguous 
analogies which permitted better utilization 
of the total time. Only three Ss felt they did 
worse on the second test and each believed 
that this was because the second form of the 
test was harder. However, all three of these 
Ss actually improved. In general, there ap- 
peared to be little consistency between the 
extent of the change and the Ss’ estimates of 
their own performance. In order to cross-validate 
and test the generality of the finding 
of improvement in MAT score on retest, 
a second experiment was performed. 
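
The footnoted correction that puts Form H and Form J scores on the Form G scale is simple enough to state as a rule. The sketch below assumes that the 30 to 70 raw score range is inclusive at both ends and that Form G scores are left unchanged; the function name is hypothetical, and the MAT Manual remains the authority on the correction itself.

```python
def form_g_equivalent(raw_score, form):
    """Convert a raw MAT score to its Form G equivalent.

    Assumption: the 30 to 70 raw score range is inclusive at both ends.
    """
    form = form.upper()
    if form not in ("G", "H", "J"):
        raise ValueError(f"unknown MAT form: {form!r}")
    # Only Form H and Form J scores in the stated range gain two points.
    if form in ("H", "J") and 30 <= raw_score <= 70:
        return raw_score + 2
    return raw_score


print(form_g_equivalent(65, "J"))  # corrected: inside the 30-70 range
print(form_g_equivalent(78, "H"))  # unchanged: above the range
print(form_g_equivalent(65, "G"))  # unchanged: already a Form G score
```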


Experiment II 


Method. Forms G and H of the MAT were ad- 
ministered in counterbalanced order to a second sam- 
ple of 17 first-year graduate students in the fall of 
1957. Since all of the Ss had taken Form J prior 
to their admission to the University, it was possible 
to constitute two groups matched on the basis of 
their scores on this form. Nine Ss (Group I) were 
first given Form H followed after two weeks by 
Form G; eight Ss (Group II) were given these same 
forms in reversed order. 


Results. The means and SDs for all 17 
Ss for three successive administrations of the 
MAT are presented in Table 2. The mean 
score obtained for the Ss on their second 


Table 2 

Means and SDs for Three Successive Administrations 
of the Miller Analogies Test 
(N = 17) 

  First Test      Second Test      Third Test 
 Mean     SD     Mean     SD     Mean     SD 
73.71    8.0    78.94    5.9    79.53    6.0 


Elimination of these Ss yielded means for Form J of 73.50 and Form H of 78.89 


experience with the MAT was significantly 
higher than the mean score for their initial 
performance on the test (t = 3.35; p < .01). 
The hypothesis of no difference between the 
mean scores for the second and third test 
performances could not be rejected. These 
findings affirmed the generality of the im- 
provement on an alternate form of the MAT 
subsequent to an initial experience with Form 
J, and suggested that little additional im- 
provement was likely to result from taking 
the test a third time. Although the possible 
lack of equivalence between Form J and the 
other forms of the MAT still could not be 
ruled out,⁴ the equivalence of Forms G and 
H and the effects of the order in which they 
were taken could be evaluated by analysis 
of variance (Lindquist, 1953, pp. 260-264, 
simple latin square design) since these forms 
had been given in counterbalanced order. 
The means and SDs for Groups I and II on 
Forms G and H are presented in Table 3. 
The F test of the effect of order was not sig- 
nificant which further indicated that there 
was no appreciable practice effect from a 
second to a third administration of the MAT. 
Although Form G tended to be more difficult 
than Form H for both groups of Ss, this 
difference only approached statistical signifi- 
cance (F = 3.89; p< .10). A third experi- 
ment, in which the effect of alternate forms 

4 It was not possible to test directly the equivalence 
of J and the other forms since it is apparently 
the current practice at MAT testing centers to give 
Form J, the most recently developed form of the 
test, to most new applicants for examination. Ideally, 
it would have been desirable to administer Form J 
and an alternate form of the MAT in counterbalanced 
order to Ss who had no previous experience 
with the test. This type of experimental design 
would simultaneously provide a test of equivalence 
of forms and the effect of practice. 


Table 3 

Means and SDs for Forms G and H of the Miller 
Analogies Test Given in Counterbalanced Order 
(N = 8 in each group) 

                Form H            Form G 
Sample        Mean     SD       Mean     SD 
Group Iᵃ     80.50    8.08     19.99     -- 
Group II     80.75    7.60     78.62     -- 

a For this analysis one of the Ss in Group I was randomly 
eliminated in order to have equal Ns in each group. 


was taken into account, was designed to fur- 
ther evaluate improvement on the MAT after 
an initial experience with the test. 
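
With two forms given in counterbalanced order, the form effect and the order (practice) effect can be separated using each S's difference between forms. The sketch below is a simplified stand-in for the Latin square analysis of variance cited in Experiment II, not a reproduction of it. It assumes an additive model (score = form effect + order effect), and the data and function name are invented for illustration.

```python
import statistics


def counterbalanced_effects(h_first, g_first):
    """Estimate form and order effects from a two-form counterbalanced design.

    h_first: (form_h_score, form_g_score) pairs for Ss given Form H first.
    g_first: (form_h_score, form_g_score) pairs for Ss given Form G first.
    Returns (form_effect, order_effect), where form_effect is H minus G and
    order_effect is second administration minus first.
    """
    # Mean within-S difference H - G in each group.
    d_h_first = statistics.mean([h - g for h, g in h_first])
    d_g_first = statistics.mean([h - g for h, g in g_first])
    # H first:  H - G = form_effect - order_effect
    # G first:  H - G = form_effect + order_effect
    form_effect = (d_h_first + d_g_first) / 2
    order_effect = (d_g_first - d_h_first) / 2
    return form_effect, order_effect


# Invented scores, two Ss per group, for illustration only.
h_first = [(80, 78), (82, 81)]
g_first = [(83, 79), (81, 78)]
form_effect, order_effect = counterbalanced_effects(h_first, g_first)
print(f"form effect (H - G) = {form_effect}, order effect = {order_effect}")
```

An order effect near zero alongside a small positive form effect would match the pattern reported for Experiment II: little further gain from a second to a third administration, with Form G tending to be slightly harder than Form H.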


Experiment III 


Method. Forms H and G were administered to 
eleven senior psychology undergraduate honors students 
who had no previous experience with the test. 
Form H was given first followed after two weeks 
by Form G. Since Form H had been previously 
demonstrated to be either equivalent to or slightly 
easier than Form G, any lack of equivalence between 
these forms would presumably operate to 
minimize a practice effect, i.e., if an easier form 
were given prior to a more difficult form, a higher 
mean score on the easier form would be expected if 
there were no practice effect. 


Results. The mean scores for Forms H 
and G were 70.55 and 75.27, respectively, 
and the difference between these means was 
statistically significant (t = 2.73; p < .05). 
This finding of improvement in an alternate 
form of the MAT after an initial experience 
with an equivalent or easier form of the test 
gives strong evidence of a practice effect. Additional 
evidence of the consistency of the 
improvement, and the high reliability of the 
MAT, was the correlation of .92 between 
Form G and H for the undergraduate Ss. 


Discussion 


To the extent that Form J was equivalent 
to Forms G and H, the data from the three 
samples in the present study consistently in- 
dicated a substantial practice effect on the 
MAT. These results contrast markedly with 
the equivalence between alternate forms re- 
ported in the MAT Manual (Miller, 1952) 
for three large independent samples of gradu- 
ate and senior undergraduate students. A 
possible explanation of the apparent discrep- 
ancy in results might be found in the differ- 
ences between the populations sampled. The 
Ss in the present study were all psychology 
students, a population which might be ex- 
pected to be more sophisticated in taking 
tests. Also, the Ss in the present study scored 
considerably higher than those in the MAT 
normative samples whose mean scores for the 
several forms of the MAT ranged from 49.6 
to 58.5 (8th to 18th percentile on norms for 
psychologists). It might be speculated that 
bright, psychologically sophisticated Ss would 
be most likely to profit from experience with 
a test such as the MAT, especially if their 
initial scores were depressed because they did 
not know what to expect on the test. 

In situations where the MAT is employed 
with a cutoff score, a low initial score on the 
test is likely to be most significant in the 
evaluation of potential graduate students. 

Table 4 

Means and SDs of Improvement Scores on the Miller Analogies Test 

          Range of Initial Scores          Percentage    Improvement     Expected Improvement 
           Raw                               of Ss          Scores        in Percentile 
Group     Score     Percentilesᵃ     N     Improved     Mean     SD       Rankᵇ 
A         82-88       78-93         10        40         .90     --       82 to 78 
B         70-79       43-70         23        87        4.26     --       52 to 66 
C         45-69        4-40         15       100        9.53     --       30 to 54 

a Percentile range is based on the norms for psychology graduate students (Miller, 1952). 
b The expected improvement in percentile rank is presented in terms of the change expected of an S 
whose initial score equals the median of the group and whose improvement score equals the mean 
improvement for the group, based on the norms for psychology graduate students (Miller, 1952). 


Therefore, the relationship between initial 
score on the MAT and improvement in score 
on a subsequent administration of an alter- 
nate form of the test was investigated. For 
this analysis, the data for the three samples 
of Ss were combined and the total sample 
was redivided into three groups which con- 
sisted of Ss whose initial scores were 80 and 
above (Group A), 70 to 79 (Group B), and 
69 and below (Group C). Improvement 
scores were obtained for each S by subtract- 
ing his initial score from his score on taking 
the test a second time. Of the 48 Ss, 39 im- 
proved. The means and SDs of the improve- 
ment scores and the percentage of Ss in each 
group who improved are presented in Table 4, 
where it may be noted that all of the Ss in 
the group with the lowest initial MAT scores 
improved and that this group showed the 
greatest mean improvement. The Ss with the 
highest initial scores did not tend to show 
any systematic improvement in taking the test 
a second time. An analysis of variance of 
the improvement scores (Lindquist, 1953, 
simple randomized design) yielded highly 
significant over-all differences in group means 
(F = 12.67; p < .001). Individual t tests 
indicated that the differences in mean improvement 
between Groups A and B (t = 
2.66; p < .02), A and C (t = 4.97; p < 
.001), and B and C (t = 3.10; p < .01) 
were all greater than would be expected by 
chance. Thus, the effects of practice on the 
MAT scores were most facilitative for Ss with 
low initial scores and the amount of improvement 
was inversely related to initial MAT 
scores. A Pearson product-moment correlation 
of −.50 indicated the extent of the 
inverse linear relationship between initial 
scores and improvement scores. 

It should be noted (a) that those Ss who 
showed greatest improvement scored in the 
range of the MAT which might be considered 
most important from the standpoint of graduate 
student selection, and (b) that the MAT 
is extremely sensitive for this range of scores, 
i.e., small improvements in raw score result 
in large increments in percentile rank. The 
percentile rank of the median initial MAT 
score in each group and the percentile rank 
which would be expected if an individual with 
this initial score made an improvement in 
score equal to the mean improvement for his 
group is given in Table 4. 
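
The grouping and correlation steps described above can be sketched as follows. The cut points (80 and above, 70 to 79, 69 and below) come from the text; the score pairs are invented and are not the data behind Table 4, and the function names are hypothetical.

```python
import math
import statistics


def initial_score_group(initial):
    """Assign an S to Group A, B, or C using the cut points in the text."""
    if initial >= 80:
        return "A"
    if initial >= 70:
        return "B"
    return "C"


def summarize_improvement(pairs):
    """Return {group: (n, mean_gain, percent_improved)} for score pairs."""
    by_group = {"A": [], "B": [], "C": []}
    for initial, second in pairs:
        # Improvement score: second score minus initial score.
        by_group[initial_score_group(initial)].append(second - initial)
    summary = {}
    for group, gains in by_group.items():
        if gains:
            pct = 100 * sum(1 for g in gains if g > 0) / len(gains)
            summary[group] = (len(gains), statistics.mean(gains), pct)
    return summary


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented (initial, second) scores, for illustration only.
pairs = [(85, 85), (82, 84), (75, 79), (72, 77), (60, 70), (55, 66)]
for group, (n, mean_gain, pct) in summarize_improvement(pairs).items():
    print(f"Group {group}: N={n}, mean gain={mean_gain:.2f}, {pct:.0f}% improved")
initial_scores = [p[0] for p in pairs]
gains = [p[1] - p[0] for p in pairs]
print(f"r(initial, gain) = {pearson_r(initial_scores, gains):.2f}")
```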

Although the findings reported in this pa- 
per do not directly challenge the validity of 
the MAT as a predictor of success in gradu- 
ate training, the application of this instru- 
ment to selection problems, especially in cases 
where cutoff scores are employed, requires 
that practice effects be taken into account. 
The relative validities of an applicant’s ini- 
tial and second MAT scores as predictors of 
his academic potential must be determined 
in order to optimally utilize the MAT in the 
selection of psychology graduate students. 


Summary 


This study was designed to test the hy- 
pothesis that scores on the Miller Analogies 
Test improve with practice. A group of 20 
first-year psychology graduate students, all of 
whom had previous experience with one form 
of the MAT, were retested with an alter- 
nate form of the test. Seventeen of these Ss 
improved their scores and the mean increase 
in score for the group was highly significant. 
The reason most frequently given for their 
improvement by these Ss was that they “knew 
what to expect” in their second experience 
with the test. This finding of a practice ef- 
fect on the MAT was cross-validated on a 
second sample of psychology graduate stu- 
dents who also showed a significant mean im- 
provement in score on retest and further con- 
firmed by a similar finding on a third inde- 
pendent group of undergraduate senior psy- 
chology honors students. Of a total of 48 Ss, 
39 improved. The improvement in scores ap- 
peared to be unrelated to the particular al- 
ternate forms of the MAT employed. In or- 
der to evaluate the relationship between ini- 
tial MAT score and improvement in score on 
retest, the data from the three samples were 
combined. The magnitude of improvement 
was found to be inversely related to the Ss’ 
initial score on the test. Maximum improve- 
ment in scores occurred for that range of the 
MAT which might be considered most important 
from the standpoint of graduate student 
selection. 



A bureaucratic social organization contains 
units of members whose functions are spe- 
cialized, and communication between units 
is narrowly channelized. Typically, a top 
managing officer has the most authority to 
make operating decisions for the organization. 
Working under him in hierarchically ordered 
lines of authority are successions of persons 
who have smaller and smaller amounts of au- 
thority, until—at the bottom of the heap— 
workers whose authority is limited to the 
performance of particular, assigned organiza- 
tional tasks are found. As top management 
modifies its operating decisions in response to 
changing external conditions, task conditions 
can be expected to vary for workers at the 
bottom of the organizational hierarchy (Gerth 
& Mills, 1946, pp. 196-224). 

Conditions within an industrial plant and 
their consequences for worker productivity 
have served as a natural metaphor for a labo- 
ratory experiment, in which the dependent 
variable, team productivity, is defined as the 
measured amount of successful accomplish- 
ment by a work team of an assigned task. 
Three independent variables are employed: 
(a) A three-level hierarchical group structure 
is defined as a set of relationships among 
three classes of persons in the performance of 
a group task, such that A directs and limits 
the task performance of B, who in turn is ap- 
pointed to coordinate the performance of C. 

1 Research conducted under Contract Noori-17, 
T.O. ITT (NR171-123) between the Office of Naval 
Research (Group Psychology Branch) and the Ohio 
State University Research Foundation. This paper 
is a condensation of a detailed technical report 
(Pepinsky, Pepinsky, Minor, & Robin, 1957), which 
may be obtained on loan from the Gifts and Ex- 
change Department of the Ohio State University 
Library. 

2 Formerly at the Ohio State University. 


In the experiment, these are represented by 
a vice-president, a team member appointed 
as department head, and other members of a 
work team, respectively. Relationships with 
a fourth person, D, are also required, such 
that B, the department head, must conduct 
transactions with D in the performance of the 
group task. Depending upon which role is 
appropriate at a particular time, D is desig- 
nated as either supplier or buyer. All trans- 
actions between the work team and the vice- 
president or supplier and buyer must be con- 
ducted for the team by their department 
head, and all transactions between depart- 
ment head and supplier must have prior ap- 
proval by the vice-president. (b) The ex- 
perimental task is defined as one that requires 
coordinated team effort and yields quantifi- 
able and reliable measures of task perform- 
ance. (c) The vice-president’s commitment 
to the department head is defined as the vice- 
president’s advance statement of the sanction 
he will give to any prospective transaction be- 
tween department head and supplier: 

The confirmation condition is one in which 
the vice-president, by his subsequent action, 
corroborates his prior commitment; he gives 
advance notice of his intent to approve or dis- 
approve and behaves accordingly. 

The contradiction condition is one in which 
the vice-president, by his subsequent action, 
fails to corroborate his prior commitment; he 
gives no advance notice of his intentions 
either to the team or to their department 
head, and his approval or disapproval is based 
upon information not available to them. 

Under the confirmation condition, a team 
is able to predict in advance the vice-president’s 
approval or disapproval of transactions between 
department head and supplier. It is 
thus assumed that the confirmation condition 
elicits responses that are compatible with 
those appropriate to successful task accom- 
plishment. Under the contradiction condi- 
tion, however, a team is not able to make ad- 
vance predictions of what the vice-president 
will approve or disapprove. Since the latter 
condition is assumed to arouse stereotyped 
or otherwise incorrect responses that compete 
with responses appropriate to successful task 
completion, it is hypothesized that produc- 
tivity will be lower under the contradiction 
condition than under the condition of con- 
firmation. 
Method 


Eighty white, male Ss, enrolled in the introductory 
psychology course at the Ohio State University, had 
volunteered for the study to meet a partial course 
requirement. Each S was assigned to a four-man 
experimental team on the basis of scheduling con- 
venience and an effort to avoid placing in the same 
group Ss who were already well acquainted with 
each other. 

The experimental task was a toy manufacturing 
problem (adapted from Hemphill, Pepinsky, Kauf- 
man, & Lipetz, 1957), which required a team of four 
Ss to operate the toy model production department 
of a small factory. Their assignment was to buy 
tinker toy parts for different kinds of toys, assemble 
the toys, and sell them. One of the team, selected 
at random, was appointed initially as department 
head. Team members worked in the laboratory at 
a shop table, placed in front of a one-way mirror, 
through which they could be observed. In the cen- 
ter of the room stood two tables for an E, who 
served at one time as supplier of parts and, as oc- 
casion demanded, as buyer of assembled toys. Be- 
hind these tables was a screen, on the other side of 
which the vice-president in charge of production, 
another E, sat at his desk. 

The team had three consecutive work sessions, each 
consisting of an initial 5-min. planning period, fol- 
lowed by a 20-min. work period in the first session 
and by 25-min. work periods in both the second and 
third sessions. No production work was allowed in 
the planning periods, but during the work periods, 
the team was to order parts for, assemble, and sell 
toys in order to realize the maximum profit within 
the allotted time. Five toy models were displayed 
at the team’s shop, and, prior to each work session, 
every team member was given documents listing the 
cost of parts for each kind of toy and the selling 
price for which each completed toy could be sold. 
Parts costs and selling prices were varied according 
to the complexity of the toy. Costs and prices, however, 
were changed from session to session; hence 
the profit margins also changed. 

Before each work session, the team was given $2.00 
in poker chips with which to buy parts. Parts 
could be ordered for only one kind of toy at a 
time; the purchase order form had to be signed by 
the team and the department head and accompanied 
by sufficient funds to cover the purchase of parts. 
Next, the department head had to take order form 
and funds to the vice-president for his signed O.K. 
of the order. The approved order was then filled by 
the supplier, and the toys were assembled back at 
the shop. Completed orders were turned over by 
the department head to the buyer, who purchased 
the toys if they were correctly assembled. All trans- 
actions between team, vice-president, supplier, and 
buyer were conducted solely by the department head, 
the only person allowed to enter the vice-president’s 
office. 

The first work session was a control condition, 
during which all teams received identical treatment: 
the vice-president approving all correctly signed purchase 
orders submitted to him. 

The contradiction condition was established for 
half of the teams during the second and third work 
sessions. Under this condition, the vice-president 
approved or refused to approve, according to a 
predetermined pattern, purchase orders submitted to him 
by the department head. Purchase Orders No. 1, 4, 
6, 9, 11, etc. were automatically approved if correctly 
signed. Purchase Orders No. 2, 3, 5, 7, 8, 10, 12, etc. 
were not countersigned by the vice-president; instead, 
for each of these orders a note was given to 
the department head, to be read aloud to his team 
and informing them that changing market conditions 
had necessitated a temporary suspension of production 
on the kind of toy ordered. An E simply 
stopped the clock when the team worked on disapproved 
orders, however, and the team’s actual work 
time during Sessions 2 and 3 included only that 
spent on approved orders. Although production 
suspensions were lifted according to a predetermined 
pattern, team members were never told how long a 
suspension would last. To minimize the likelihood 
of correct predictions by teams run under the 
contradiction condition, a promised budget increase and 
carry over of net profit from the second to the third 
work session did not materialize. 
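
The listed order numbers are consistent with a pattern that repeats every five purchase orders, with the first and fourth order in each run of five approved. That periodicity is only an assumption about what the "etc." covers; the text gives the pattern explicitly through order 12 and no further. A minimal sketch, with a hypothetical function name:

```python
def is_approved(order_number):
    """Return True if the purchase order falls on an approved position.

    Assumption: the listed pattern (1, 4, 6, 9, 11 approved) repeats every
    five orders, which is one way to read the "etc." in the text.
    """
    if order_number < 1:
        raise ValueError("purchase orders are numbered from 1")
    return order_number % 5 in (1, 4)


approved = [n for n in range(1, 13) if is_approved(n)]
suspended = [n for n in range(1, 13) if not is_approved(n)]
print(approved)   # [1, 4, 6, 9, 11]
print(suspended)  # [2, 3, 5, 7, 8, 10, 12]
```

Under this reading two of every five orders are approved, which a team could in principle detect; in the experiment the schedule was not disclosed, and the postsession checks are what establish that contradiction teams did not learn it.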

The confirmation condition was maintained throughout 
the second and third work sessions for the other 
half of the teams. Here, too, a predetermined schedule 
of temporary work suspensions was used by the 
vice-president as a guide in approving or disapproving 
purchase orders submitted to him. Under this 
condition, however, the toys to be suspended and 
their times of suspension were clearly specified at 
the beginning of each work session, in the form of 
oral and typed announcements, to every team. To 
arrive at the pattern of suspensions for a given team, 
each was paired with a preceding team run under 
the contradiction condition. The empirical record 
kept for the contradiction team was used to specify 
an identical pattern of suspensions for its paired 
confirmation team. Because it was desired to maximize 
the likelihood of correct predictions by teams 
run under the confirmation condition, team members 
under this condition were always correctly informed 
about their future budget allocations. 

In the experiment, 20 four-man teams were divided 
into 10 consecutive team pairs. Ten teams 
were run under the contradiction condition and ten 
under the confirmation condition. 


Results 


Several checks were made on the experi- 
mental procedure. First, a postsession ques- 
tionnaire and interviews clearly established 
that the confirmation teams could predict and 
the contradiction teams could not ever learn 
to predict what the pattern of suspensions 
would be during the experimental sessions. 
While a majority of both groups claimed that 
their task motivation was increased by the ex- 
perimental conditions, a significantly greater 
proportion of the contradiction team mem- 
bers reported that their teams became more 
disorganized and less efficient when they 
could not anticipate temporary production 
suspensions. Yet neither group of Ss re- 
garded themselves as having been punished 
during the experiment. These results suggest 
that teams under the contradiction condition 
were not less motivated to perform the experi- 
mental task nor more negatively reinforced in 
performing it than teams under the confirma- 
tion condition; the contradiction condition, 
however, did seem to elicit a greater number 
of responses that were extraneous to efficient 
task performance. 

Second, a check upon the task motivation 
effects of the confirmation and contradiction 
conditions was made by correlating amount of 
profit per order with 5-min. work periods 
within each session for teams run under each 
condition. Positive correlations for both 
groups were anticipated for all three sessions; 
the correlations for the confirmation groups 
should have been significantly greater than 
the correlations for the contradiction groups, 
however, only if profit were to be viewed as 
a greater incentive for the confirmation group. 
All correlations were significantly positive and 
increased slightly from session to session, but 
correlations for teams run under the two con- 
ditions do not differ significantly from each 
other. This result supports the view that the 
task motivation effects of the two experi- 
mental conditions were similar. 

A third procedural check was made to de- 
termine whether the pattern of temporary 
production suspensions was comparable for 
the ten teams within each experimental con- 
dition. Because profit margins differed for 
the various toy models, the experimental 
stimuli could be regarded as comparable 
within a condition only if restrictions upon 
the production of each model were randomly 
distributed among the ten teams. The check 
was made by plotting the frequency with 
which purchase orders for the five kinds of 
toys were disapproved for each of the ten con- 
tradiction teams during the second and third 
work sessions. A chi-square test of associa- 
tion between toy suspension and team indi- 
cates that these variables are independent. 
Hence, productivity scores for the teams 
within each condition could be pooled for 
every session, and direct comparisons could 
be made between the pooled scores of the 
confirmation and contradiction groups. 
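
The third check amounts to a chi-square test of independence on a team-by-toy table of disapproved orders. The paper does not report the counts or the statistic, so the table below is invented; the function is a minimal sketch of the standard Pearson computation, and its name is hypothetical.

```python
def chi_square_independence(table):
    """Return (chi_square, df) for a two-way table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi_square += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df


# Invented counts: rows are teams, columns are the five toy models.
disapprovals = [
    [3, 2, 4, 1, 2],
    [2, 3, 3, 2, 2],
    [4, 1, 3, 2, 2],
]
chi_square, df = chi_square_independence(disapprovals)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

The statistic is then compared with the tabled chi-square value for the given degrees of freedom. With ten teams and five toys the table would have 36 degrees of freedom, and small expected counts in individual cells would make the approximation rough.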

Tests now could be made of the experimental 
hypothesis: that team productivity 
would be greater under the confirmation than 
under the contradiction condition. In making 
these tests, it was predicted that the two 
sets of teams would not differ significantly in 
their productivity during Session 1 (the control 
session), but that they would differ significantly 
in Sessions 2 and 3 (the experimental 
sessions). The results of comparing 
the contradiction and confirmation groups in 
terms of the teams’ net profit accumulated in 
each work session are shown in Table 1. The 
initial, Session 1, difference scores are not significantly 
different from zero, nor are the regression 
coefficients of the Session 2 and 3 
scores on Session 1 scores. In every case, 
however, the variation attributable to the 
experimental conditions is significant: the 
amount of net profit earned by the confirmation 
teams is significantly greater than that 
earned by the contradiction teams in Sessions 
2 and 3 and in the combined sessions. For 
net profit, then, the stated prediction is supported 
by the data. 


Table 1 

Team Pair Differences in Net Profit 
(Confirmation team score − Contradiction team score) 

Rows, by session: mean difference; t (Session 1 difference); 
regression coefficient; t (regression coefficient); 
experimental (residual) variation; t (experimental variation). 
Surviving entries: −.14; 82; 3.36** 

Note. Minus sign indicates confirmation scores less than contradiction scores. 
** Significant at .01 level. 


Table 2 

Team Pair Differences in Net Profit Per Minute of Actual Work Time 
(Confirmation team score − Contradiction team score) 

Rows, by session: mean difference; t (Session 1 difference); 
regression coefficient; t (regression coefficient); 
experimental (residual) variation; t (experimental variation). 
Surviving entries: − 02; −.09; 01; 1.02 

Note. Minus sign indicates confirmation scores less than contradiction scores. 
* Significant at .05 level. 


A more rigorous test of the hypothesis is 
provided by comparing the performance of 
confirmation and contradiction teams on the 
basis of their net profit per minute of actual 
work time. These results are reported in 
Table 2. Again, the mean difference between 
the scores of team pairs is not significantly 
different from zero. The mean differences between 
the scores of team pairs that can be attributed 
to the experimental conditions are 
significant in Session 3 and in the combined 
second and third session, but not in Session 2. 
Since the combined Session 2 and 3 scores are 
more reliable than those of either session 
alone, the results are interpreted as supporting 
the hypothesis. It may be noted, though, 
that the obtained differences are significant 
at the .05 level of probability, whereas for 


the net profit scores the differences were sig- 
nificant at the .01 level. 
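
The core of the team-pair comparison can be sketched as a t test on paired difference scores. This is a simplified illustration under stated assumptions: the analysis reported in the tables also adjusts the Session 2 and 3 differences for their regression on Session 1 differences, which is omitted here, and the function name and the profit figures are hypothetical.

```python
import math
import statistics


def paired_t(confirmation, contradiction):
    """Return (mean difference, t, degrees of freedom) for paired team scores."""
    # Difference is confirmation minus contradiction, as in the tables
    diffs = [c - k for c, k in zip(confirmation, contradiction)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / standard_error, n - 1


# Hypothetical net profit for four team pairs in one experimental session
confirmation_profit = [3.0, 5.0, 7.0, 9.0]
contradiction_profit = [1.0, 2.0, 3.0, 4.0]
mean_diff, t, dof = paired_t(confirmation_profit, contradiction_profit)
print(f"mean difference = {mean_diff:.2f}, t = {t:.2f}, df = {dof}")
```

Pairing removes variation shared by the two teams of a pair, so only the within-pair differences enter the test.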


Discussion 


In summary, the experimental results are 
consistent with the hypothesis: the confirmation
teams were more productive than the
contradiction teams, both in net profit earned 
during the experimental sessions and in net 
profit per minute of actual work time. A sub- 
sidiary prediction in respect to net profit per 
number of completed orders was not reliably 
supported by the data, although even here 
the trend was in the expected direction. 

The experimental results are interpreted 
not only as providing support for the central 
hypothesis, but as lending credence to its un- 
derlying rationale. Specifically, it can be in- 
ferred that the contradiction condition lowers 
team productivity (a) because it arouses re- 
sponses that are not appropriate to the successful
completion of the task and (b) because
the occurrence of these responses op-
erates to reduce the frequency with which 
responses appropriate to the task can occur. 
While this rationale has internal consistency, 
it is also given empirical support by the ob- 
served behavior of the Ss in the experiment. 
Under the confirmation condition, with fore- 
knowledge of what production suspensions 
were to occur, there were but rare occasions 
in which a team seemed to have difficulty in 
adjusting from the operating freedom of Ses- 
sion 1 to the production restrictions of the 
second session. Under the contradiction con- 
dition, however, the Ss seemed to experience 
considerable difficulty in adjusting to the new
and unpredictable state of affairs. This was
manifested in many ways: e.g., trouble in coordinating
activity for the building of new
toys; time spent in trying to predict what new
orders the vice-president would disapprove;
joking about the actions of the Es; hostility
toward the supplier, who was seen as being
too slow or as short-changing the team in 
money or supplies; filling out orders incor- 
rectly; or in the team’s becoming immobi- 
lized. We can interpret these actions as “ir- 
relevant” in the sense that they do not fa- 
cilitate successful task completion. It should 
be kept in mind that when such irrelevant 
action occurred during the discussion of dis- 
approved orders, it was not counted against 
the actual work time of a contradiction team. 
In one form or another, however, irrelevant 
action occurred frequently enough during the 
processing of approved orders to cut down 
materially on the profit-making activity of 
the contradiction Ss. 

Alternative rationales might have yielded 
similar predictions, e.g., those provided by 
students of statistical uncertainty effects
(Hake, 1955; Macy, Christie, & Luce, 1953),
of anxiety (Brown & Farber, 1951; Pepinsky
& Pepinsky, 1954; Taylor, 1956), and of
ambiguity intolerance (Blake & Ramsey, 1951).
Indeed, a more extended statement of the 
present rationale would indicate its indebted- 
ness to all of these contributions. A major 
implication of the experiment is that what 
has been found here may be predicted for 
other settings whose formal properties corre- 
spond to the conditions created in the labo- 
ratory: (a) where task efficiency is a major 
criterion of group productivity, (b) where 
the participation of all team members is re- 
quired for success on an assigned task, and 
(c) where the team must function within a 
hierarchical structure in which communica- 
tion is narrowly channelized. It must be kept 
in mind, however, that in this study (and in 
most other small group experiments) team 
members performed together on one occasion 
only. The effects of these conditions main- 
tained over extended time might be increased 
or minimized by compensatory or adaptive 
responses of the team. 


Summary 


A simulated small industrial plant was the 
setting for an experiment in which a team of 
Ss worked together on a manufacturing problem.
Their assigned task was to produce different
kinds of toys at a profit. Team productivity,
the dependent variable, was operationally
defined as the amount of net profit
earned by the team. A three-level hierarchical
group structure was used in which all trans- 
actions between the team and a vice-president 
or a supplier and buyer had to be conducted 
by an appointed department head, and all 
supply orders required prior approval of the 
vice-president. 

Twenty four-man teams were divided into 
ten consecutive team pairs, each member of a 
pair being subjected either to (a) a condition 
under which the team’s expectations of man- 
agement were contradicted by subsequent 
events or (b) a condition under which the
team’s expectations were confirmed. The hy- 
pothesis that team productivity would be 
greater under the confirmation condition was 
supported by the data. Some theoretical im- 
plications of the experiment were suggested. 
In recent years there have been two developments
in the field of evaluation of per-
formance: the appearance of the critical inci- 
dent technique proposed by Flanagan (1954) 
and the forced-choice type scale described by 
Sisson (1948). A preference for objective 
description of behavior was at the root of the 
former, while in the latter a scale where a 
rater’s enthusiasm for or against the man 
would no longer have free rein was the objective.
Sisson (1948), in listing the assumptions
of the forced-choice method, states that
differences in competence or efficiency can be 
described in “objective, observable items of 
behavior.” However, the forced-choice scale
produced by the Personnel Research Section,
TAGO, includes such items, without elaboration,
as “egotistical,” “nervous,” “easy-going,”
“cool-headed,” and “anti-social.” Are not the
meanings of these words open for discussion
in any group?

It seemed reasonable to consider combining 
the strengths of both methods: to form tetrads
of critical incidents in a forced-choice
scale covering the areas of behavior in which 
men were to be evaluated. Clearly, because 
an incident is cited by someone as being 
illustrative of critically effective or ineffective 
behavior, it does not mean that it is neces- 
sarily held to be so by others. Fortunately, 
the process of arriving at an index of dis- 
crimination allows us to ascertain the degree 
of consensus. 


Procedure 


The evaluation of the performance of foremen 
was the objective, and the location, a manufacturing 
department of a plant employing approximately 500 
men. After a general briefing of all levels of super- 
vision from assistant foremen to the plant manager, 
confidential interviews were conducted by the writer. 
A random sample of nonsupervisory personnel was 
drawn from all members of the manufacturing de- 
partment, while all incumbents of the other two 
levels were interviewed. No guidance was given the 
person being interviewed other than a definition of 
the material sought, namely, critical incidents describing
something outstandingly effective or ineffective
that a foreman had done. On completion of
the description of the incident, the substance was re- 
corded. An effort was made to disguise the origin 
of the information and to whom it applied, while 
retaining the significance of the incident. Each re- 
corded incident was then checked with the informant 
for accuracy and clarity. The total number of incidents
gathered was 691, with representation from all
levels as presented in Table 1. 

An analysis of the 691 incidents reduced the num- 
ber to 337 on the grounds of overlapping, duplica- 
tion, etc. The degree to which the behavior de- 
scribed in the incidents actually applied to foremen 
was determined. The key was the same as that reported
by Sisson (1948):


1. Exceedingly high or highest possible degree
2. To an unusual or outstanding degree
3. To a typical degree
4. To a limited degree
5. To a slight degree or not at all


Superintendents and supervisors (13) were requested 
to think of three foremen that they knew very well 
who were, respectively: 


1. Outstandingly effective in over-all competence 

2. Average, no worse than nor better than most 
in over-all competence 

3. Least effective in over-all competence 


The foremen’s superiors proceeded through the 337 
mimeographed incidents, indicating for each the ap- 
plicability of the item to each of the foremen repre- 
senting the three levels of competence. The standard
procedure for computing the preference and discrimination
indices was used (Sisson, 1948). The
cumulative frequency of the discrimination values,
plotted on probability paper, yielded a straight line.
Pairs of items were then selected, one having maximal
discriminative power between effective and ineffective
foremen and another minimal, but equal
in preference value to the first. The experimental
forced-choice scale was composed of twenty pairs of
effective and thirty pairs of ineffective behaviors.


Table 1

Sources of Incidents

Columns: Group; Sum of Incidents; M Number per Person
Rows: Supervisors; Superintendents; Managers; Foremen and ass’t foremen; Nonsupervisory personnel
(Cell values are not recoverable from the source.)

The paired-comparison method of ranking foremen 
was used as a criterion which on a test-retest assessment
(a month apart) yielded reliabilities of .93, .83,
and .76 for the three sections with which we were 
concerned. The foremen were rated by their im- 
mediate supervisor and superintendent, with each 
acting independently of the other. Both superiors 
had sufficient contact with their foremen to justify 
this procedure. The criterion was thus the mean 
rank orders assigned by these two men for the two 
occasions. The same supervisors and superintend- 
ents three months later completed the forced-choice 
scale on their own foremen. To test the validity of 
the experimental scale, a rank order coefficient was 
computed to assess the degree of relationship be- 
tween the mean scores of the forced-choice scale 
and mean paired-comparison scores.
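
A rank order coefficient of the kind described can be sketched as below. The text does not say which coefficient was used; Spearman's rho, in its simple form for untied ranks, is assumed here, and the function name and the example ranks are hypothetical.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order coefficient for two sets of untied ranks."""
    n = len(ranks_a)
    # Sum of squared differences between the two ranks given to each foreman
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ranks of five foremen on the criterion and on the scale
paired_comparison_ranks = [1, 2, 3, 4, 5]
forced_choice_ranks = [2, 1, 4, 3, 5]
print(round(spearman_rho(paired_comparison_ranks, forced_choice_ranks), 2))
```

A coefficient near zero would indicate that the forced-choice scores order the foremen no better than chance relative to the criterion.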


Results and Discussion 


None of the relationships between rank or- 
ders based on forced-choice total scores and 
the paired-comparison rankings was found 
even to approach significance in any of the 
three sections. The correlation between the in- 
dividual supervisor’s paired-comparison rank- 
ing and the scores achieved by his men on 
the forced-choice scale also was not signifi- 
cant. 

Following Kelley (1939), the discrimina- 
tive values of all items on the scale were ana- 
lyzed for the top and bottom 27% of fore- 
men rated by the paired-comparison method. 
The validity of each item was determined by 
reference to Flanagan’s (1939) table as given 
in Thorndike (1949, Appendix B). This 
analysis yielded only 7 significant items of 
the 50 that were discriminating according to 
the original assessment. The results were 
convincingly discouraging and no further 
analyses were undertaken. 
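
The selection of extreme groups for the item analysis can be sketched as follows. This is an illustration only: the function name is hypothetical, and rounding the 27% group size to the nearest whole man is an assumption not stated in the text.

```python
def extreme_groups(ranked_foremen, fraction=0.27):
    """Split a best-to-worst ranking into its top and bottom fractions."""
    # Group size rounded to the nearest whole number, at least one man
    k = max(1, round(len(ranked_foremen) * fraction))
    return ranked_foremen[:k], ranked_foremen[-k:]


# Hypothetical ranking of 100 foremen, identified by criterion rank
ranking = list(range(1, 101))
top, bottom = extreme_groups(ranking)
print(len(top), len(bottom))
```

Each item's discriminative value is then judged from how differently it is applied to the men in the two groups.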

One interpretation of the results might be 
that in the establishment of the criterion, the 
rater could allow his partiality free rein, 
whereas the forced-choice scale had been 
successful in preventing this from occurring. 
This is rejected, however, as the scores on the 
forced-choice scale for all foremen showed a 
mean of 26 and a standard deviation of 2.3.
This leads us to believe that the scale is not
discriminating, for the most likely value, if 
chance alone were operating, would be 25. 
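
The chance expectation cited here can be worked out under a simple model. The sketch assumes that each of the 50 pairs is scored 1 when the discriminating item is chosen and that a rater choosing at random picks it with probability .5; both the model and the function name are assumptions for illustration.

```python
import math


def chance_score_distribution(n_pairs, p_keyed=0.5):
    """Mean and standard deviation of total score under random choice (binomial)."""
    mean = n_pairs * p_keyed
    sd = math.sqrt(n_pairs * p_keyed * (1 - p_keyed))
    return mean, sd


mean, sd = chance_score_distribution(50)
print(f"chance mean = {mean:.0f}, chance SD = {sd:.2f}")
```

Under these assumptions the chance mean is 25 with a standard deviation of about 3.5, so an observed mean of 26 lies well within the range that random responding would produce.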
The raters, when interviewed on their re- 
actions to the use of the scale, threw some 
light on the results. A common reaction was 
their claimed inability to actually decide 
which of the two alternative behaviors fitted 
most or least the man being rated. However, 
it will be recalled that, on arriving at an in- 
dex of discrimination, the rater actually did 
assess the degree of applicability of the be- 
havior to the most effective, average, and 
least effective foreman he knew. It might be 
posited that the success in one and the failure 
in the other arises from halo effect and 
anonymity in the first as compared with 
identification of the rated in the latter. 
However, a simpler explanation demands 
attention. The distribution of discrimination 
indices for the 337 incidents corresponded 
very closely to the normal probability curve. 
We are forced to conclude that the items 
where maximal and minimal discriminative 
indices were obtained were chance values, 
and our judges throughout actually were in- 
capable of assessing the degree of likelihood 
of effective, average and ineffective foremen 
doing that which was described in the critical 
incident. The results obtained in this study 
therefore would seem to discourage the feasi- 
bility of using the level of specificity provided 
by the critical incident technique, despite the 
fact that objective description of behavior for 
many people has preference over inferred personal
characteristics in contemporary methodology.
The present study is concerned with the
identification of possible predictors of personal 
adjustment to conditions of Arctic isolation. 
A considerable number of persons serving the 
interests of government and industry are cur- 
rently located in such geographically remote 
and isolated areas. With the advent of space 
exploration, this trend toward isolated living 
will undoubtedly be further augmented. 

Beyond the survival problems posed, life 
in the Arctic can pose serious adjustment 
problems for the individual (Eilbert, Glaser, 
& Hanes, 1957). The Arctic environment is 
restrictive and characterized by deprivation. 
For a considerable portion of the year, living 
is largely confined to indoors. The environ- 
ment deprives the individual of many familiar 
social stimuli, presents him with repetitive 
stimuli often to the point of satiation, and 
places him in a situation that allows him 
essentially no privacy. Under such condi- 
tions, it is not surprising to find that morale 
and work efficiency are often adversely af- 
fected. Finding leisure activities which can 
be related to personal development and satis- 
faction represents a major problem (Air Site 
Project Staff, 1952; Air Site Project Staff & 
Miller, 1952b; Eilbert, Glaser, & Hanes,
1957). 

The objective of this study was to identify 
variables for the development of selection 
techniques to minimize the number of per- 
sonal adjustment problems of men at isolated 
Arctic military bases. The Ss of the study 
were 648 enlisted Air Force personnel assigned
to eight Arctic bases. The ages of
the men in these groups ranged from 18 to
47 years, with a median age of 20 years.
They had been in their present isolated station
from 2 to 12 months, with a mean of
7 months. Seventy-six per cent were enlisted
airmen and 24% were noncommissioned
officers.

1 This research was supported in part by the United
States Air Force under Contract No. AF 4(657)-74,
monitored by the Personnel Laboratory, Air Force
Personnel and Training Research Center, Lackland
Air Force Base, Texas. Permission is granted for
reproduction, translation, publication, use, and disposal
in whole and in part by or for the United
States Government.

2 The authors wish to express their appreciation to
Murray Glanzer for his valuable assistance in the
course of the study.


Procedure 


The working definition of Arctic adjustment 
adopted was effectiveness on the job and the ability 
to get along with others. The measure of ability to 
function in the Arctic was rating by immediate 
supervisors. These supervisors nominated the best 
and most poorly adjusted men in their section or 
detachment. A score of plus one and minus one was 
assigned to positive and negative nominations on 
each of nine items of a supervisor rating form, and 
the men of each section were classified on the basis 
of their total scores. Means and standard deviations 
were computed for these supervisor nomination scores 
by section. Men whose scores were more than one 
sigma above or below the mean for their section 
were selected for inclusion in the “well adjusted” 
(high) and “poorly adjusted” (low) groups. The
high and low groups consisted of 112 and 83 men, 
respectively. 
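
The criterion-group selection described above can be sketched as follows. The function name and the example scores are hypothetical, and the use of the population standard deviation for "sigma" is an assumption; the text does not specify the formula.

```python
import statistics


def criterion_groups(scores_by_man):
    """Select men more than one sigma above or below their section mean.

    scores_by_man maps each man in ONE section to his total supervisor
    nomination score (+1 per positive, -1 per negative nomination).
    """
    values = list(scores_by_man.values())
    mean = statistics.mean(values)
    sigma = statistics.pstdev(values)  # assumption: population SD
    high = [man for man, score in scores_by_man.items() if score > mean + sigma]
    low = [man for man, score in scores_by_man.items() if score < mean - sigma]
    return high, low


# Hypothetical section of five men
section = {"A": 9, "B": 0, "C": 0, "D": 0, "E": -9}
well_adjusted, poorly_adjusted = criterion_groups(section)
print(well_adjusted, poorly_adjusted)
```

Computing the cutoffs section by section keeps a lenient or severe supervisor from filling one criterion group with his own men.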

The two groups were compared for differences in 
the general areas of personal background, personality
characteristics, and medical complaints. Some
of the instruments used were exploratory and some 
were selected because they had shown promise in 
previous research (Sharp & Harper, 1953; Sharp, 
Goldstein, & Bolanvich, 1954; Stunkel, Tye, & 
Yaukey, 1952). The following survey and test 
instruments were used: 


1. Biographical Inventory. The 150-item inven- 
tory used contained items that tapped the following 
areas: military history, employment record, educa- 
tional background, family background, organizational 
membership, friendship patterns, marital history, 
sports and hobbies, personal characteristics, and aspirations
and plans.

2. Self-Appraisal Blank. This consisted of 42 
forced-choice quintets of descriptive adjectives and 
phrases. The men were asked to record which item 
of each quintet was most descriptive and which was 
least descriptive of themselves. 

3. Incomplete Sentences Test. The form used 
contained 70 items and was based largely on the 
Holsopple and Miale Test (1952) and the Incomplete
Sentences Test for Pilots (American Institute for 
Research, 1953). 

4. Medical Symptoms List. This was a check list 
of 68 items representing a relatively comprehensive 
coverage of medical complaints. 

5. Anxiety Scale. A slightly modified version of 
the Taylor Manifest Anxiety Scale (1953) was used. 

6. Food Aversion List. Men reported their dislikes
of items in a list of foods. This list was based
on findings of previous investigators (Altus, 1949), 
suggesting that aversions to these foods are related 
to neuroticism. 

7. General Information Test. Earlier studies 
(Sharp & Harper, 1953; Sharp, Goldstein, & Bolan- 
vich, 1954) had indicated that certain types of 
information were related to Arctic adjustment. 
Forty multiple-choice items sampled automotive information,
sports, literature, and art.

8. Peer Nomination Form. This was composed of 
ten items closely resembling those used in the super- 
visor ratings. Men were asked to nominate the best 
and poorest men in their sections in response to 
questions about job proficiency, ability to get along 
with others, and general adjustment to the Arctic 
and to the Air Force. 


In addition to these measures, Air Force aptitude 
test and job proficiency test scores were obtained 
for the men in the two groups. Medical record 
data for these men showing the number of sick call 
visits and the number of hospitalizations were also 
obtained. 


Results 


Since the purpose of the study is primarily 
exploratory, the levels of statistical signifi- 
cance used for analysis of the data are not
stringent. The level of confidence adopted
for evaluating the high and low group differ- 
ences was the 5% level for tests and the 
10% level for test items. No significant dif- 
ferences in the mean scores of the high and 
low criterion groups were found for the Medi- 
cal Symptoms List, Anxiety Scale, Food Aver- 
sion List, or General Information Test. The 
salient findings of survey instruments which 
were found to differentiate the criterion groups 
can be summarized as follows: 

Biographical Inventory. Analysis consisted 
of chi-square comparison of the high and low 
groups’ response choices to each item. Thirty 
of the 150 items were found to yield differ- 
ences between the criterion groups that were 
statistically significant (p < .10). The personal
history characteristics that were found
to differentiate members of the poorly ad- 
justed group were: urban background, rela- 
tively high socioeconomic background, and a
history of minor infractions of military rules
and regulations. A statistically significant 
nonlinear relationship was found between the 
measure of adjustment and the age at which 
independence from family was achieved, i.e., 
having their own money, buying their own 
clothes, and going on dates. Men who re- 
ported this independence at relatively young 
or old ages were more prone to be in the 
poorly adjusted group. 

Self-Appraisal Blank. Scoring was based 
on an Arctic key which had been empirically 
derived by previous investigators (Sharp et
al., 1953, 1954; U. S. Army, The Adjutant
General’s Office, 1949). Significant differ- 
ences between the mean scores of the high 
and low groups were found (t test, p < .01).
In general, members of the well adjusted 
group tended to describe themselves as con- 
scientious and responsible individuals who 
accept rather than resist authority. The men 
who were judged to be poorly adjusted tended 
to describe themselves in other, less consistent 
terms. 

Incomplete Sentences Test. This test was 
used to investigate the personality and atti- 
tudinal characteristics that might differentiate 
the well adjusted from the poorly adjusted 
group. Specific subtest areas were: attitude 
toward Arctic assignment, fears and com- 
plaints, attitude toward work, interpersonal 
and family relationships, moral and sexual 
attitudes, and goals and aspirations. A brief 
a priori rationale prepared for each item and 
developed on a holdout sample of 100 was 
used as the basis for scoring. Using these 
rationales, responses which were consistent 
with good adjustment were scored two; in- 
determinate responses were scored one; re- 
sponses suggestive of poor adjustment were 
scored zero. Significant differences between 
the mean scores of the high and low groups 
were found (t test, p < .01).³ Of the two
groups, the member of the poorly adjusted
group was found to do more complaining, be
more fearful of the Arctic, have greater difficulties
in his interpersonal relationships, be
less inclined to do better than marginal work,
and be more concerned about the possibility
that his Arctic assignment would disrupt his
relationship with his wife or girl friend.

3 Since, in this analysis, the direction of the difference
between the two groups was hypothesized, a
one-tailed test was used.

Peer Nomination Form. The scoring pro- 
cedure for the Peer Nomination Form was 
similar to that described for the supervisor 
rating form. Significant differences in the 
distributions of peer nomination scores for 
the high and low groups were found (Kol- 
mogorov-Smirnov test, p < .001). Men who 
were identified by their supervisors as being 
well adjusted to the Arctic were also likely 
to be considered well adjusted by the other 
men. Conversely, men identified by the super- 
visors as being poorly adjusted were also 
likely to be so judged by the other men. 

Job Proficiency Tests. Air Force proficiency
test scores were used to compare the
high and low groups. The mean score of 
men judged to be well adjusted was found to 
be significantly higher (t test, p < .001) than
that of the poorly adjusted group. 

Aptitude Scores. Aptitude scores were ob- 
tained from each airman’s personal records. 
These scores were based on the tests included 
in the Airman Classification Battery and the 
Airman Qualification Examination. An aver- 
age aptitude test score was computed for each 
man. The distributions of the mean aptitude 
test scores of the two criterion groups were 
compared. The difference between the dis- 
tributions of mean aptitude scores for the 
“high” and “low” groups was found to be 
statistically significant (Kolmogorov-Smirnov 
test, p < .05).

Sick Call Rate. Sick call records were 
obtained for 66 men of the well adjusted and 
53 men of the poorly adjusted group. For 
each group the number of men who made 1, 
2, 3, 4, etc. sick call visits was tabulated. 
It was found that the incidence of these visits 
was greater for the poorly adjusted group 
(Kolmogorov-Smirnov test, p < .05). More
than half of the members of the well adjusted 
group had had no attendance at sick call. 
The poorly adjusted group, although the 
smaller of the two, accounted for 66% of the 
total number of sick call visits of the com- 
bined groups. 
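
The comparison of two distributions by the Kolmogorov-Smirnov method rests on the largest gap between their cumulative proportions. The sketch below computes the two-sided form of that statistic; whether the study used a one- or two-tailed form is not stated, and the function name and the visit counts are hypothetical.

```python
def ks_statistic(sample_a, sample_b):
    """Largest absolute gap between the two empirical cumulative distributions."""
    points = sorted(set(sample_a) | set(sample_b))
    n_a, n_b = len(sample_a), len(sample_b)
    largest_gap = 0.0
    for x in points:
        # Proportion of each sample at or below x
        cdf_a = sum(value <= x for value in sample_a) / n_a
        cdf_b = sum(value <= x for value in sample_b) / n_b
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical numbers of sick call visits per man
well_adjusted_visits = [0, 0, 0, 1, 1, 2]
poorly_adjusted_visits = [0, 1, 2, 3, 4, 6]
print(round(ks_statistic(well_adjusted_visits, poorly_adjusted_visits), 2))
```

Because the statistic uses the whole cumulative distribution, it suits counts such as sick call visits, where many men have none and a few have many.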


Discussion 


The data reported here were collected in 
the Arctic. The extent to which similar
measures collected prior to Arctic assignment
would be predictive of adjustment is not 
answered by these data. However, the meas- 
ures which are suggested for consideration in 
the development of a selection battery are 
readily obtained before assignment to an iso- 
lated environment, i.e., biographical data, self- 
appraisal, attitudes toward the job situation, 
anxiety about interpersonal relationships, job 
aptitudes and job proficiency, judgments by
peers, and medical record. In general these 
results suggest the hypothesis that individuals 
who adjust well to Arctic isolation are indi- 
viduals who also adjust well to their military 
assignments elsewhere. Isolated environments 
probably present a more extreme stimulus 
situation which more frequently and more 
strongly evokes maladjustive behavior. 

The findings with respect to self-appraisal 
and biographical data replicate the results of 
previous studies (Sharp et al., 1954; Stunkel 
et al., 1952). The scoring key for the self- 
appraisal blank previously developed held up 
in the present study. As in the previous 
studies, a biographical information blank 
again showed some promise as a predictor of 
adjustment. No cross validation of the find- 
ings was performed in the study reported here. 

This study has been concerned with one 
type of isolated environment, namely, the 
isolation of groups of men in geographically 
remote areas. The extent to which the find- 
ings obtained under such conditions are gen- 
eralizable to other types of isolation is of 
theoretical and practical interest. Other types 
of isolation arise, for example, as a result of 
cultural differences, communication barriers, 
personal characteristics objectionable to a 
group, and prolonged confinement as in a sub- 
marine or space ship. 

It is further suggested that study of the 
behavior of individuals in an isolated environ- 
ment for the purpose of minimizing what is 
defined as poor adjustment in that situation 
can involve three approaches: (a) Selection:
as indicated in this study. (b) Training:
observations made in the course of this study
indicate that appropriate prior indoctrination
might facilitate adjustment. This could consist
of “familiarization training” concerning
the characteristics of the new environment
and personal counseling in the use of leisure
time. (c) Group Structure and Management:
the kind of leadership and interpersonal
structure which can mitigate any undesirable
effects of isolated living needs to be
studied further (Air Site Project Staff &
Miller, 1952a). Continued studies of the
effects of isolation and sensory deprivation
should uncover variables which need to be 
manipulated in designing conditions of isola- 
tion favorable to adjustment. 


Received October 21, 1958. 


Leo R. Eilbert and Robert Glaser


References


Air Site Project Staff, & Miller, D. C. Human relations
at A. C. & W. sites. I. Summary of findings:
An interim report of the first year’s work.
Hum. Resources Res. Inst. Rep., 1952, No. HR-8. (a)

Air Site Project Staff, & Miller, D. C. Human relations
at A. C. & W. sites. II. Personnel problems.
Hum. Resources Res. Inst. Rep., 1952, No. HR-9. (b)

Air Site Project Staff, & Miller, D. C. Human relations
at A. C. & W. sites. III. Needs of site personnel.
Hum. Resources Res. Inst. Rep., 1952, No. HR-10. (c)

Altus, W. D. Adjustment and food aversions among
Army illiterates. J. consult. Psychol., 1949, 13,
429-432.

American Institute for Research. Incomplete Sentences
Test for Pilots, Form D. Pittsburgh: Author, 1953.

Eilbert, L. R., Glaser, R., & Hanes, R. M. Research
on the feasibility of selection of personnel for duty
at isolated stations. USAF Personnel Train. Res.
Cent. tech. Rep., 1957, No. 57-4.

Holsopple, J. Q., & Miale, Florence R. Sentence
completion: A projective method for the study of
personality. Springfield, Ill.: Charles C Thomas,
1952.

Sharp, L. H., & Harper, Bertha. Selection of quartermaster
personnel for Arctic assignment. USA
TAGO Personnel Res. Br. Rep., 1953, No. 999.

Sharp, L. H., Goldstein, L. G., & Bolanvich, D. J.
Further study on selection of quartermaster personnel
for Arctic assignment. USA TAGO Personnel
Res. Br. Rep., 1954, No. 1089.

Stunkel, Eva R., Tye, V. M., & Yaukey, D. W.
Validation of experimental selection instruments
for Arctic service. USA TAGO Personnel Res.
Sect. Rep., 1952, No. 945.

Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953, 48,
285-290.

U. S. Army, The Adjutant General’s Office. Construction
of a self-description blank for Arctic assignment.
USA TAGO Personnel Res. Sect. Rep.,
1949, No. 835.




Journal of Applied Psychology 
Vol. 43, No. 4, 1959 


STUDIES OF TRANSPARENCY IN FORCED-CHOICE 
SCALES: 


I. EVIDENCE OF TRANSPARENCY 


HOWARD MAHER 


University of Pennsylvania 


Mais in 1951 reported that a forced-choice, 
self confidence key developed from Jurgen- 
sen’s Classification Inventory was found to be 
fakable. In 1953, Longstaff and Jurgensen 
substantiated the finding. In a study de- 
signed to be more like the situation in which 
the test was intended to be used, e.g., the in- 
dustrial situation, they found important re- 
sponse changes. More recently, Borislow 
(1958) has demonstrated fakability in the 
Edwards PPS. 

In 1956, Schutter and Maher reported the
development of a forced-choice study activity
questionnaire (SAQ). Maher¹ has found that
this test holds its validity for a university 
situation different from the one in which the 
test was constructed. In the original article, 
Schutter and Maher cited Scates’ (1949) review
of the Wrenn Study Habits Inventory in
which he noted that the inventory contained 
a number of easy “outs” for the student. The 
authors went on to propose the forced-choice 
test as a means of eliminating such “outs.” 
The present study was designed to test for
transparency in SAQ. It has the advantage
of not having to generalize from students’ 
simulations of perhaps unfamiliar situations 
as in the Longstaff and Jurgensen study. 
Rather the student, here, is asked to “beat 
the test” in a situation he has experienced, 
i.e., the academic one. The study also was 
undertaken, to be honest, because the author 
doubted Longstaff and Jurgensen’s (1953) 
statement that there “. . . is no reason to
believe that different results [fakability]
would be obtained from any other forced- 
choice personality test” (p. 89). 

Finally, there is an attempt to see whether
instructions, specifically a study skills lecture,
would increase transparency. If so, the test
could not be considered of value if administered
following “How-to-Study” orientation
programs.

¹ Article in preparation.


Procedure 


The 30-block forced-choice form (SAQ), fully described
by Schutter and Maher (1956), was administered
to 106 students in each of two introductory
psychology lectures. The instructions did not
especially emphasize the honesty aspect. Instead, 
the lecturer mentioned only that this was a research 
project and should be taken seriously, specifically: 

“We are conducting a research project and would 
like you to participate in the experiment. If you do
not wish to take part in the study, which requires
the serious and accurate completion of a short questionnaire,
you may leave now.”

For those participating, a laboratory credit was
promised. As the questionnaire was passed out, the
need for seriousness of response was again stressed.
For recording answers, the students used IBM electrographic
pencils and answer sheets.

Three days following this, i.e., at the next class
meeting, one of the lecture sections was given a 
study skills lecture, a standard procedure in the 
course. The study was timed to straddle this regu- 
larly scheduled lecture. In the lecture, the material 
to be covered is also prescribed; in fact, it is “tied 
down” by a mimeographed handout provided for the 
students. Thus no attempt was made to avoid or to 
cover the items in SAQ. The handout and lecture 
resemble Hilgard’s (1957) “Management of Learn- 
ing” chapter (Ch. 11). The students were told they 
would be held accountable for this material at the 
next meeting of the class. This group is hereafter 
designated as BI—Beat with Instructions. 

The procedure was also arranged in point of time 
so that the other lecture section would be attending 
a scheduled chapel meeting at the time (also three 
days later) when they would normally have a lec- 
ture. Thus, for psychology class, at least, nothing 
intervenes to make for disturbance of the remainder 
of the design. This group becomes BNI—Beat, No 
Instructions; i.e., they did not have the study skills 
lecture, prior to the next step.

One week after the original administration (“honesty”
condition), SAQ was readministered. This
time, the students were asked to pretend that they 
were applying for admission to the university and 
that they would only be admitted if they obtained a 
high score. No attempt was made to define a “high 
score,” the procedure being equivalent to Longstaff 
and Jurgensen’s “fake over-all good” score. Furthermore,
the investigator told the students that this
was a forced-choice form deliberately designed to 
eliminate “beating.” Finally, the investigator at- 
tempted to motivate the students by saying that he 
did not think them capable of “spotting” the cor- 
rect (keyed) answers. 

Following this, the papers were scored (separately 
for the BI and BNI groups and for the two ad- 
ministrations) on the IBM test scoring machine. 


Results 


Is SAQ transparent? The question was 
handled by application of the signed-ranks 
test on individuals’ difference scores from the 
“honest” to the “beat” condition. In so do- 
ing, the BI and BNI conditions were com- 
bined, the test thus being made for trans- 
parency regardless of the effect of having or 
not having instructions. Even without sta- 
tistical test, the answer seems apparent. For 
212 difference scores, only 24 are negative,
i.e., 24 Ss have scores lower on the “beat”
attempt than on the “honest” condition. 
Three Ss had no change from the one condi- 
tion to the other. The other 185 cases all 
had score increases. For the total group the 
mean increase in scores is 14.2 points, equiva- 
lent to one sigma under “honest” conditions. 
(The possible score range on the test is from 
-41 to 38, with the mean score under the
honest condition being 2.4.) The largest gain 
for any S is 48 points. The signed-ranks test 
yields a p value (two-tailed test) beyond the 
.001 level—reason for rejecting the hypothe- 
sis of nontransparency in SAQ. 
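
The remark that the answer "seems apparent" even without a statistical test can be checked roughly from the reported counts alone. The sketch below is not the signed-ranks test used in the study, which needs the individual difference scores. It substitutes a simpler exact sign test on the 185 gains and 24 losses, and assumes the three ties are dropped. The function name is illustrative.

```python
from math import comb


def sign_test_two_sided(n_pos: int, n_neg: int) -> float:
    """Exact two-sided sign-test p value (ties assumed already dropped)."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Counts reported in the text: 185 gains, 24 losses, 3 ties (dropped here).
p_value = sign_test_two_sided(185, 24)
print(p_value < 0.001)
```

Because the sign test ignores the size of each change, it is generally less sensitive than the signed-ranks test, yet it still places the result well beyond the .001 level.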

To answer the question of the effect of in- 
struction upon transparency, it is first neces- 
sary to demonstrate that the BI and BNI 
groups are comparable under the “honest” 
condition, i.e., that the two groups start from 
comparable scores. Table 1 shows that this 
is so. For the values in Table 1, t = 1.58,
lacking significance at the .05 level. 
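
As a rough illustration of what the reported t implies, the sketch below works backward from t = 1.58 and the two standard deviations in Table 1 to the size of the difference between the group means. It assumes an ordinary independent-samples t test with 106 Ss per group; the source does not say which form of the test was used, and the function name is illustrative.

```python
from math import sqrt


def implied_mean_difference(t: float, sd1: float, n1: int,
                            sd2: float, n2: int) -> float:
    """Mean difference implied by t under an independent-samples t test."""
    # Standard error of the difference between two independent means.
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return t * standard_error


# Values from Table 1 and the text: SDs of 12.84 and 12.79, N = 106 each.
difference = implied_mean_difference(1.58, 12.84, 106, 12.79, 106)
print(round(difference, 2))
```

Under these assumptions the two groups differ by somewhat under three points at the outset, which is small next to standard deviations of nearly 13 points.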

Table 1

Means and Standard Deviations for the BNI and BI
Groups under the “Honesty” Condition

M SD N

12.84 106
12.79 106

With this comparable starting point established,
it is possible to test the gains for the
BI vs. the BNI group. The BI group mean 
gain is 12.9 points (SD = 13.34). The group 
with no instruction (BNI), on the other hand, 
gained 15.4 points (SD = 12.95). The dif- 
ference is in the direction opposite from that 
hypothesized, i.e., it had been expected that 
if there were any difference in gain scores it 
would favor the group with instructions. The 
Mann-Whitney U (two-tailed) provides p = 
.09. Thus one must conclude that instruc- 
tions have no additional effect upon trans- 
parency. 
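
The reported gains can be cross-checked with a simple large-sample comparison of the two means. This is only a sketch: the study used the Mann-Whitney U, which needs the individual scores, so the normal-approximation z test below is a substitute and its p value will not match the reported .09 exactly. Group sizes of 106 are taken from the Procedure.

```python
from math import erf, sqrt


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z test for a difference between two independent means."""
    standard_error = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p


# Mean gains from the text: BNI 15.4 (SD 12.95), BI 12.9 (SD 13.34).
z, p = two_sample_z(15.4, 12.95, 106, 12.9, 13.34, 106)

# With equal group sizes, the overall mean gain is the average of the two.
overall_gain = (15.4 + 12.9) / 2
print(round(z, 2), p > 0.05, round(overall_gain, 2))
```

The approximation agrees with the text on both counts: the difference between groups falls short of the .05 level, and the average of the two group gains is close to the 14.2 points reported for the total group.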

What is the result of this transparency? 
The product-moment correlation between 
“honest” and “beat” scores for the BNI 
group is r = .09; for the BI group r = .01.
Of greatest importance, the validity of the
test under the “beat” condition (N = 95 for
whom grade point averages are available) is
r = .10 for BNI; r = .07 for BI (N = 94).
Under the honest conditions validities are .45 
for BNI and .56 for BI. 
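
To make the loss of validity concrete, each coefficient can be converted to a t statistic with the usual formula t = r·sqrt(N − 2)/sqrt(1 − r²). The sketch below assumes the "honest" validities rest on the same Ns (95 and 94) given for the "beat" condition, which the text does not state, and uses an approximate two-tailed .05 critical value of 1.99 for about 92 degrees of freedom.

```python
from math import sqrt


def t_for_r(r: float, n: int) -> float:
    """t statistic for testing a product-moment correlation against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Approximate two-tailed .05 critical value for df near 92 (an assumption).
CRITICAL_T = 1.99

validities = {
    "BNI beat": (0.10, 95),
    "BI beat": (0.07, 94),
    "BNI honest": (0.45, 95),  # N assumed equal to the "beat" N
    "BI honest": (0.56, 94),   # N assumed equal to the "beat" N
}

for label, (r, n) in validities.items():
    t = t_for_r(r, n)
    print(label, round(t, 2), "significant" if t > CRITICAL_T else "n.s.")
```

On these assumptions both "honest" validities are clearly different from zero, while neither "beat" validity is.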


Discussion 


There is shown here a condition similar 
to those found by others. Thus, although 
forced-choice scales are specifically designed 
to greatly retard faking, there is demonstrated
that, for still another forced-choice
form, statistically significant increases occur
when these devices are subjected to pressure. 
In this instance, the conditions are ones fa- 
miliar to the Ss used, whereas, in the Long- 
staff and Jurgensen study, students are asked 
to simulate a perhaps unfamiliar condition. 
Most important, in this study, it is possible 
to study directly the effects of such “beating” 
attempts upon the validity of the test. 

The average gain found in this study was 
14 points on a possible 79-point range, where 
under the honest condition SAQ has usually 
yielded mean scores at approximately the 
midpoint of this range. There is demon- 
strated, here, not only a statistically signifi- 
cant gain from “honest” to “beat” conditions 
but one of practical significance as well. Thus 
the mean shifts approximately 35% of the 
distance from the midpoint to the top of the 
possible score range. 
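
The 35% figure follows from the score range given in the Results. A minimal sketch of the arithmetic, using the rounded 14-point gain quoted in this paragraph:

```python
# Possible score range reported in the Results.
LOW, HIGH = -41, 38

score_span = HIGH - LOW          # the 79-point range
midpoint = (LOW + HIGH) / 2      # centre of the possible range
gain = 14                        # rounded mean gain quoted in the Discussion

# Share of the distance from the midpoint to the top covered by the gain.
share = gain / (HIGH - midpoint)
print(score_span, midpoint, round(100 * share))
```

The gain of 14 points covers 14/39.5 of the distance from the midpoint (-1.5) to the maximum score (38), which rounds to 35%.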

One could argue that even greater distortion
would be obtained with the usual study
habits measurement device, i.e., perhaps these 
instruments are virtually perfectly transpar- 
ent and scores could be moved to the very 
top of the range. However, one cannot ignore 
the fact that under “beat” conditions SAQ 
loses all validity. The mere fact that the 
forced-choice design may have had some re- 
tarding effect on the mean shift, then, does 
not preserve its validity. With this finding, 
also, it becomes clear that we could not 
merely construct norms for “no pressure” 
conditions (e.g., counseling and guidance) 
and others for administrative pressure condi- 
tions (as academic selection or sectioning). 
The present form of SAQ, thus, will be of 
value only in those cases in which such pres- 
sures do not exist and where honesty of re- 
sponse may be safely assumed. 

Another pertinent finding is that transpar- 
ency is not increased via a rather typical 
study skills lecture. As a matter of fact, 
there is a slight, but not statistically signifi- 
cant, tendency for a group not having such 
information to “beat” the test more than 
those having the lecture. The finding of no 
difference attributable to information is not 
entirely surprising. For one thing, not all of 
the weighted items in SAQ are covered in 
even more extensive study orientation pro- 
grams or texts. More to the point, not all of 
the ‘advice’ given in such sources receives 
weight in SAQ’s scoring system. The author 
believes, furthermore, that the lecture given 
was typical enough so that no general infor- 
mation on “how to study” would affect scores 
on the test. Thus we appear to have the ad- 
ditional finding that (again), assuming con- 
ditions of honesty, SAQ could be used fol- 
lowing such orientation and would not as a 
direct consequence be contaminated. 

The major point remains, however; the au- 
thor must agree with Longstaff and Jurgen- 
sen that any forced-choice scale must be con- 
sidered potentially beatable. We are, there- 
fore, faced, in the future, with demonstrating 
the sources of transparency in this and other
scales. Such is beyond the intent of this present
study, it being the purpose here to investigate
the presence of, rather than the
sources of, transparency. 

However, several transparency hypotheses
are here set forth. It is the intention of the
author to systematically examine these in fu- 
ture investigations. 

1. Schutter and Maher (1956), in the 
original article, noted a strange operation of 
the forced-choice form used with SAQ, i.e.,
the Richardson system. “. . . it is clear that 
they [students] do not resist unfavorable al- 
ternatives to the extent that raters do. These 
latter deny favorable alternatives or admit 
unfavorable ones less frequently. In this case 
32% of the responses are of this nature” (p.
257). 

Certainly this finding would make a form 
extremely “beatable.” All the student has to 
do, after denying favorable statements or ad- 
mitting to unfavorable statements in the 
“honest” condition, is shift away from such 
responses in the “beat” condition. 

As a rough check on this, the first 14 items 
in SAQ were item analyzed using the IBM 
Graphic Item Counter. The analysis was 
performed for the response “most descrip- 
tive” only. For these items there was a total 
group gain of 1250 points on the most descriptive
side from the “honest” to the “beat”
condition. Of this gain, 33% is accounted 
for by shifts from a negative statement listed 
as “most descriptive.” In other words, for 
these items 33% of the transparency is ac- 
counted for by use of the Richardson system 
in a situation in which it rather obviously 
does not operate as expected. The finding
should be fully investigated for all items on 
both the most descriptive and least descrip- 
tive responses. For the present, however, it 
would appear that the Richardson format, in 
the study skills area of measurement, con- 
tributes to transparency. It is also of inter- 
est to recall that Highland and Berkshire 
(1951, p. 35) found a Richardson rating form 
and a similar one less bias-resistant than other 
forced-choice forms in which favorable and 
unfavorable statements are not mixed in the 
item. 

2. Highland and Berkshire (1951), in their 
excellent investigation of some of the basics 
of forced-choice methodology, have speculated 
on the possibility that, in the pairing of items 
(equal on some appearance index, e.g., preference
index; unequal on discrimination index),
large discrimination index differences,
while making for increased validity, might
also make for increased transparency. 

3. To the best of the author’s knowledge, 
appearance pairings have traditionally been
pairings on mean appearance scores. The
standard deviations of the appearance indices 
apparently have been ignored. 

4. Appearance indices are traditionally es- 
tablished out of the context in which they 
are eventually to be used. Often a question- 
naire stage is first resorted to, each item in 
the questionnaire being rated, one by one, on 
some appearance scale. In the forced-choice 
stage, on the other hand, they appear more 
directly in comparison with one another, in 
a particular context or “field.” It might be 
possible to pretest such “field” conditions 
through such treatments as paired-compari- 
sons ratings or some variation of these. 
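
The third hypothesis can be illustrated with a small sketch in which two statements are treated as matched only if both the means and the standard deviations of their appearance ratings agree. The ratings, item names, and tolerances below are hypothetical and chosen only for illustration; nothing here comes from the SAQ item pool.

```python
from statistics import mean, stdev

# Hypothetical appearance ratings (illustrative only, not data from the study).
ratings = {
    "item_a": [4, 4, 4, 4, 4],
    "item_b": [2, 3, 4, 5, 6],
    "item_c": [4, 4, 5, 3, 4],
}


def appearance_index(scores):
    """Return the mean and standard deviation of one item's ratings."""
    return mean(scores), stdev(scores)


def matched(item1, item2, mean_tol=0.25, sd_tol=0.75):
    """Report whether two items agree on mean and on spread, separately."""
    m1, s1 = appearance_index(ratings[item1])
    m2, s2 = appearance_index(ratings[item2])
    return abs(m1 - m2) <= mean_tol, abs(s1 - s2) <= sd_tol


# Same mean, very different spread: matched on the mean only.
print(matched("item_a", "item_b"))
# Same mean, similar spread: matched on both.
print(matched("item_a", "item_c"))
```

A pairing rule that looks only at the first element of each result would treat all three items as interchangeable, which is the situation the hypothesis questions.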

Finally, although the author must now 
agree with the conclusion of Longstaff and 
Jurgensen that all forced-choice devices could 
be transparent, he would prefer to emphasize 
the fact that they are not necessarily so. But 
only systematic investigation will lead to the 
sources of transparency. These should yield 
generalizable information on the construction 
of valid and nontransparent scales. 


Summary 


There is demonstrated here for a study ac- 
tivity questionnaire (SAQ) a condition found 
by others, i.e., forced-choice forms are, to a 
serious extent, transparent. In this study 
there has been found the statistically signifi- 
cant upward shift in mean score from “hon- 
est” to “beat” conditions seen in other in- 
vestigations. 

For the first time, to the best of the au- 
thor’s knowledge, the study shows directly 
the serious effect of such a shift, i.e., validity 
disappears under “beat” conditions. SAQ
could still be of value in those instances in
which pressures to increase scores do not
exist. In this event, however, there is no particular
advantage to SAQ’s forced-choice
format and its use would have to be determined
solely in terms of its validity coefficients
in “honesty” situations.

The author would not conclude, as have 
others (Longstaff & Jurgensen, 1953), that all 
forced-choice tests are transparent and that 
other techniques must be devised. Such is 
equivalent to tossing out the baby with the 
bath. Instead, he feels that we must seek 
for the sources of transparency in these other- 
wise promising instruments. Several sources 
are hypothesized with the hope that, once sys- 
tematically investigated, they will enable us 
to construct truly nontransparent question- 
naires. One such source is partially investi- 
gated here, indicating that the Richardson 
and similar formats may increase the trans- 
parency in the framework of measurement for 
which SAQ is intended. 

Finally, the study demonstrates that a
“how to study” lecture does not further increase
the transparency of the study questionnaire.

OF ATTITUDE INTENSITY AND PERSONAL
INVOLVEMENT ¹


LANE H. RILAND 


Eastman Kodak Company, Rochester, New York 


There has been little or no published research
on Guttman’s fourth principal component
of scalable attitudes, involution, other
than his article (1954) describing the theory 
in postulating this component. In an earlier 
article, Katz (1944) stated that in the at- 
tempt to more closely define the attitude and 
predict responses questions of “personal in- 
volvement” were used but were not successful 
in predicting responses to related content 
items. Guttman, in his study, measured per- 
sonal involvement in listening to broadcasts 
of the Voice of Israel by such questions as 
“How often do you listen to the VOI?” and 
“Do you make it a point to open the radio at 
certain hours for certain programs?” These 
data were utilized to determine the re- 
spondent’s “involution” regarding this area of 
attitude. 

His hypothesis in relating these data to 
content responses was that the scores on per- 
sonal involvement in listening to the broad- 
casts, when plotted against the content scale 
scores on favorableness (“Do you think the 
broadcasts are good in general?”), would
yield an M-shaped curve with the middle low 
point or zero-point of involution or involve- 
ment at the same point along the content 
continuum as that of the zero-point of the 
intensity curve for the same data. The inten- 
sity component was measured by questions 
regarding how strongly the respondents felt 
about their expressed attitudes of favorable- 
ness or unfavorableness. 

As noted by Guttman, prior to the time 
when the data on involution were processed, 
his associates predicted that the resultant 
curve of involution would assume the same
characteristic shape as that of the intensity
component, i.e., U or J shaped, with those
respondents most and least favorable being
the most involved personally in their attitudes.
The results of the study, however,
were in agreement with Guttman’s theoretical
prediction in that he found the M-shaped
relationship, indicating that the respondents
who were both most and least favorable in
their attitudes toward the broadcasts were
also the least personally involved in their
listening habits regarding these broadcasts.
He stated that this showed a “prejudiced” or
“unreasoned” attitude.

¹ This paper is based on a portion of a doctoral
dissertation, Pennsylvania State University, 1958,
under the direction of Lester P. Guest. The financial
assistance of the Hamilton Watch Co. in conducting
this study is gratefully acknowledged. The author is
also indebted to C. C. Upshall and S. M. Newhall
for their helpful criticism of the initial draft.

Evidence pointing to the possible confusion 
of these two components of intensity and 
involution may be found in the discussion of 
the problem of error in the relationship of 
content and intensity by Guttman and Such- 
man in an earlier article (1947). They de- 
scribe situations where respondents having 
answered “Undecided” to a content item 
would answer “Very strongly” to the standard
intensity item, “How strongly do you
feel about your answer to that (content)
question?”, which followed the content item.
When asked why they answered in this way, 
many respondents said they meant the problem
in question was “Very important.” Assuming
that, as in all questionnaire research,
differing verbal habits may cause respondents 
to think quite differently about what a certain 
term means, there still appears to be some 
indication that the usual intensity question
(“How strongly do you feel about your answer?”)
may often be confused with the
typical item used to measure Guttman’s com- 
ponent of involution, e.g., “How important is 
this to you personally?” 


Problem 


The problem in this investigation was to 
determine whether the involution-type metric
would yield the M-shaped curve hypothesized
by Guttman, and also to study the relation- 
ship of the intensity function and that of 
involution. In this study it was hypothesized 
that the functions of these two components 
were more closely related from the standpoint 
of respondent perception than indicated by 
Guttman in his primarily mathematical for- 
mulation (1954). The hypothesis was that 
when involution scores were plotted against 
content scores, the resultant regression would 
not be M shaped, but would more closely ap- 
proach the U- or J-shaped curve characteristic 
of intensity scores when plotted against con- 
tent scores. Those respondents most involved 
in their attitudes would be the most and least 
favorable on content. A significant, positive
statistical relationship between the intensity
and involution scores was also hypothesized. 


Procedure 


The intensity and involution scores were derived
from the responses of a random sample of 388 residents
of a central Pennsylvania community to a
questionnaire regarding their attitudes toward a local 
company. Using Guttman scale analysis, a six-item, 
“general attitude” scale was developed with a repro- 
ducibility of .88. Although the number of scale 
items was somewhat below the number recommended 
by Guttman for unidimensional scales, it was felt 
that their reliability had been adequately established 
according to Guttman theory in that they scaled for 
the entire survey population. 
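
For readers unfamiliar with the reproducibility figure, the sketch below uses the usual definition of Guttman's coefficient, Rep = 1 − errors / (respondents × items). The formula is assumed here rather than quoted from the study, and the error count is only what a value of .88 would imply for 388 respondents and six items.

```python
def reproducibility(errors: int, respondents: int, items: int) -> float:
    """Coefficient of reproducibility under the usual Guttman definition."""
    return 1 - errors / (respondents * items)


def implied_errors(rep: float, respondents: int, items: int) -> int:
    """Number of scale errors implied by a given reproducibility."""
    return round((1 - rep) * respondents * items)


# Figures from the text: 388 respondents, six items, reproducibility .88.
errors = implied_errors(0.88, 388, 6)
print(errors, round(reproducibility(errors, 388, 6), 2))
```

On this reading, roughly 280 of the 2,328 item responses departed from the perfect scale pattern.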

Intensity was measured by three, two-part inten- 
sity items attached to content items which met the 
Guttman requirements for scalability over the entire 
sample. The intensity score consisted of the sum 
of the scale values of the responses to these three 
items. In this two-part technique, a separate item
followed the content item and asked the respondent 
“how strongly” he felt about his answer to the pre- 
ceding content item. This technique was felt to be 
superior to the “foldover” technique (Suchman, 1950) 
sometimes used in intensity measurement, as it is 
a more independent measure. 

Involution was measured by three items also attached
to scalable content items. These items were
designed to obtain an index of the respondent’s
personal involvement in his thinking about matters
concerning the company: (a) whether they were
“interested in” or “kept abreast” of development
there, (b) whether they were “personally concerned”
with future developments there, and (c) whether the
matter of the company “keeping up with other
companies in town” was important to them personally.
The scale scores on these three items were totaled
to obtain the involution score.

Table 1

Relationship of General Content and Two-part Intensity Scores

N = 388

[Table body not recoverable from the source. Rows: Intensity Score (12 and below); columns: General Content Score and Cum. %. Summary rows: Midpoint of Content Percentile; Median of Intensity Percentile (surviving values: 59, 48, 34).]

* Cell containing median for each content score.

Table 2

Relationship of General Content and Involution Score

N = 388

[Table body not recoverable from the source. Rows: Involution Score (15 and below); columns: General Content Score and Cum. %. Summary rows: Midpoint of Content Percentile; Median of Involution Percentile (surviving value: 42).]

* Cell containing median for each content score.

The median intensity score percentiles were then 
plotted against the midpoints of the content score 
percentiles on the general attitude toward the com- 
pany, as described by Suchman (1950), resulting in 
the intensity curve. The involution scores were
plotted against the content scores in the same fashion
to obtain the involution curve.

A chi-square test of significance was utilized and
a coefficient of contingency was calculated to determine
the statistical relationship of the intensity and
involution scores.


Results 


Table 1 shows the relationship of the general
attitude content scores and the two-part
intensity scores from which the intensity
curve was plotted. This curve appears in
Fig. 1.

The relationship of the general attitude
content scores and the involution scores is
shown in Table 2.

[Table fragment (General Content Score): 63.5, 90.0; remaining cells not recoverable.]

Figure 1 shows the intensity curve in this
study approximates the U or J shape typically
found in the Guttman analysis of intensity.
The zero-point, or point of indifference, falls
at approximately the 25th content percentile.
This, according to Guttman theory, is the
point which separates the favorable and unfavorable
respondents on the content continuum
(Guttman & Suchman, 1947).

Fig. 1. Intensity curve showing the relation of intensity
scores to general attitude content scores.

Figure 2 shows the involution curve found
in this study as well as the M-shaped curve
of involution postulated by Guttman. It can
be seen that the involution curve in this study
does not assume the shape postulated by
Guttman, and that in this study the respondents
who were most and least favorable were
on the average more involved rather than
least involved, as predicted by his theoretical
formulation.

Fig. 2. Involution curve showing the relation of
involution scores to general attitude content scores
as well as the curve postulated by Guttman.

The intensity and involution curves found
were superimposed on the same matrix in
Fig. 3. This comparison shows that these
two components had quite similar regressions
when plotted against the general attitude content
percentiles.

Fig. 3. Comparison of the curves of intensity and
involution on the same general attitude content
matrix.

In his study, Guttman presented no statistical
correlation of the intensity and involution
scores found, but preferred only to compare
the coincidence of zero-points and the
locations of the bending points of the two
curves. Table 3 shows that in this study
the coefficient of contingency of .28, although
not extremely high, was very significant. The
maximum C possible for this size table is .82
(Siegel, 1956).
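
The contingency coefficient and its ceiling can be recovered from the chi-square value of 34.53 reported in Table 3. The sketch below uses the standard formulas C = sqrt(χ² / (χ² + N)) and, for a k × k table, a maximum of sqrt((k − 1) / k). A 3 × 3 table is assumed from the reported df of 4, and the function names are illustrative.

```python
from math import sqrt


def contingency_coefficient(chi_square: float, n: int) -> float:
    """Pearson's coefficient of contingency C."""
    return sqrt(chi_square / (chi_square + n))


def max_contingency(k: int) -> float:
    """Upper bound of C for a square k x k table."""
    return sqrt((k - 1) / k)


# Values from Table 3: chi square = 34.53, N = 388, df = 4 (3 x 3 assumed).
c = contingency_coefficient(34.53, 388)
c_max = max_contingency(3)
print(round(c, 3), round(c_max, 2), round(c / c_max, 2))
```

The computed C comes out just under .29, in line with the reported .28 allowing for rounding, and the 3 × 3 ceiling matches the .82 quoted from Siegel. Relative to that ceiling, the observed coefficient is about a third of its maximum.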

Table 3

Statistical Relationship of Intensity and
Involution Scores

N = 388

Involution Score

7-11 12-15

Intensity

62 127
78 74
14 5

Chi square = 34.53
df = 4; significant at less than the .001 level.
Coefficient of contingency = .28
Maximum C possible for 3 x 3 table = .82


Discussion


The hypothesis that the curves of intensity
and involution would closely approximate
each other appeared to be generally substantiated.
This finding is not in agreement with
Guttman’s original findings in postulating
these two supposedly different components of
scalable attitudes. In the present study the 
respondents who were most involved were 
those who were most and least favorable in 
their attitudes toward the company. 

In his original article, Guttman (1954) 
stated that those who were most and least 
favorable, and had little or no personal in- 
volvement in the VOI broadcasts toward 
which the attitude was expressed, showed a 
“prejudiced” or “unreasoned” attitude. He
stated that this type of “prejudice” related 
to the lack of personal involvement in radio 
may be called that due to the “absence of 
reasoning from lack of personal contact with 
radio.” He also stated that there may be 
another type of “prejudice” which fits his idea 
of involution for other types of data which
might be called “cessation of reasoning even
though there is close contact.” This thinking 
was based on the results of a previous study, 
also reported in his original article (1954), 
which resulted in the postulation of his third 
component of “closure” measuring generally 
whether or not the respondent had “definitely 
made up his mind” on the attitude issue in 
question. The aspect of closure was not investigated
in this present study. In the original
study, Guttman defined respondent involution
as “turning the attitude over and
over within himself,” but measured it as per- 
sonal involvement, as was done in this present 
study. 

This “prejudice principle” was not evident
in this study. The more favorable respondents
did appear to be slightly more involved
than those least favorable, but the findings 
do not indicate that even these least favorable 
respondents were the least involved in their 
attitudes. 

The zero-points of both the intensity and 
involution curves do generally coincide as 
postulated by Guttman, however. 

A perfect statistical correlation between 
intensity and involution was not expected, 
but as Guttman’s associates in the Israel 
study (1954) predicted, and as Guttman and 
Suchman (1947) found in their analysis of 
intensity error, there appears to be some 
question as to the distinctness of these two 
components, at least as measured by items 
worded in the suggested fashion. Perhaps researchers
using these two components should
undertake a more intensive analysis of the 
wording of these items, particularly with re- 
gard to personal involvement, in order to 
more clearly define the interrelationships be- 
tween them, if they are in fact measuring 
distinct components of scalable attitudes as 
postulated by Guttman. 


Summary and Conclusions 


A random sample of 388 residents of a 
central Pennsylvania community were sur- 
veyed regarding their attitudes toward a local 
company. Guttman scaling techniques were 
applied, and a six-item scale of “general atti- 
tude” resulted, with a reproducibility of .88. 
These six items scaled for the entire survey 
sample. The respondents’ attitude intensity 
and personal involvement (involution) in 
their attitudes toward the company were 
measured and analyzed by the techniques sug- 
gested by Guttman to test his theory that 
the intensity and involution components would 
show U- and M-shaped regressions, respec- 
tively, when plotted against general attitude 
content (favorableness). 

It was hypothesized that the intensity and 
involution regressions would show similar 
curves, and that those respondents who were 
the most and least favorable toward the com- 
pany would also be the most involved in their 
attitudes toward the company, and not the 
least involved as predicted by Guttman. It 
was also hypothesized that there would be a 
significant, positive statistical relationship between
the scores on intensity and involution.

In light of the results of this study, the 
following conclusions appear to be justified: 

1. The regression of involution scores
against general attitude content scores re- 
sulted in a curve quite similar in shape to 
that of the intensity scores when plotted 
against content scores, indicating that those 
respondents most involved in their attitudes 
toward the company were on the average the 
most and least favorable in their attitudes. 

2. There was a very significant, although
not extremely high, positive relationship between
the intensity of the attitudes expressed
and personal involvement in the attitudes
toward the company.

3. More research is needed on these two
components to more clearly define the interrelationships
between them.